Symbol-level combining for multiple input multiple output (MIMO) systems with hybrid automatic repeat request (HARQ) and/or repetition coding

ABSTRACT

Systems and methods are provided for decoding signal vectors in multiple-input multiple-output (MIMO) systems, where the receiver has received one or more signal vectors from the same transmitted vector. The symbols of the received signal vectors are combined, forming a combined received signal vector that may be treated as a single received signal vector. The combined signal vector is then decoded using a maximum-likelihood decoder. In some embodiments, the combined received signal vector may be processed prior to decoding. Systems and methods are also provided for computing soft information from a combined signal vector based on a decoding metric. Computationally intensive calculations can be extracted from the critical path and implemented in preprocessors and/or postprocessors.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present disclosure is a continuation of U.S. application Ser. No. 11/781,208, filed on Jul. 20, 2007, which claims priority under 35 U.S.C. §119(e) to U.S. Provisional Application Nos. 60/820,419, filed Jul. 26, 2006, 60/822,294, filed Aug. 14, 2006, and 60/822,821, filed Aug. 18, 2006.

BACKGROUND OF THE INVENTION

This invention relates to a technique for decoding a received signal vector in a multiple-input multiple-output (MIMO) data transmission or storage system, where the receiver may receive multiple instances of the same transmitted signal vector.

In a data transmission or storage system, it is desirable for information, often grouped into packets, to be accurately received at a destination. A transmitter at or near the source sends the information provided by the source via a signal or signal vector. A receiver at or near the destination processes the signal sent by the transmitter. The medium, or media, between the transmitter and receiver, through which the information is sent, may corrupt the signal such that the receiver is unable to correctly reconstruct the transmitted information. Therefore, given a transmission medium, sufficient reliability is obtained through careful design of the transmitter and receiver, and of their respective components.

There are many strategies for designing the transmitter and receiver. When the channel characteristics are known, the transmitter and receiver often implement signal processing techniques, such as transmitter precoders and receiver equalizers, to reduce or remove the effects caused by the channel and effectively recover the transmitted signal. Intersymbol interference (ISI) is one example of a channel effect that may be approximately eliminated using signal processing.

However, not all signal corruption arises from deterministic sources such as ISI. Non-deterministic sources, such as noise sources, may also affect the signal. Due to noise and other factors, signal processing techniques may not be entirely effective at eliminating adverse channel effects on their own. Therefore, designers often add redundancy to the data stream in order to correct errors that occur during transmission. The redundancy added to the data stream is determined based on an error correction code, which is another design variable. Common error correction codes include Reed-Solomon and Golay codes.

One straightforward way to implement a code is to use forward error correction (FEC). The transmitter encodes the data according to an error correction code and transmits the encoded information. Upon reception of the data, the receiver decodes the data using the same error correction code, ideally eliminating any errors. Hereinafter, “decoding” refers to a method for producing an estimate of the transmitted sequence in any suitable form (e.g., a binary sequence, a sequence of probabilities, etc.).

Another way to implement a code for error correction is to use automatic repeat request (ARQ). Unlike FEC, ARQ schemes use error-detecting rather than error-correcting codes. The ARQ transmitter encodes data based on an error-detecting code, such as a cyclic redundancy check (CRC) code. After decoding the data based on the error-detecting code, if an error is detected, the receiver sends a request to the transmitter to retransmit that codeword. Thus, ARQ protocols require a forward channel for communication from transmitter to receiver and a back channel for communication from receiver to transmitter. Ultimately, the receiver will not accept a packet of data until there are no errors detected in the packet.

Finally, FEC and ARQ may be combined into what is known as hybrid automatic repeat request (HARQ). There are at least three standard HARQ protocols. HARQ type-I typically uses a code that is capable of both error-correction and error-detection. For example, a codeword may be constructed by first protecting the message with an error-detecting code, such as a CRC code, and then further encoding the CRC-protected message with an error-correcting code, such as a Reed-Solomon, Golay, convolutional, turbo, or low-density parity check (LDPC) code. When the receiver receives such a code, it first attempts FEC by decoding the error correction code. If, after error detection, there are still errors present, the receiver will request a retransmission of that packet. Otherwise, it accepts the received vector.
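The HARQ type-I codeword construction described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `encode` and `decode` are hypothetical, CRC-32 from Python's `zlib` module stands in for the error-detecting code, and a toy three-fold repetition code with bitwise majority voting stands in for the error-correcting code (a practical system would use a stronger code, such as one of those listed above).

```python
import zlib

def encode(message: bytes) -> bytes:
    """CRC-protect the message, then apply a toy rate-1/3 repetition code."""
    protected = message + zlib.crc32(message).to_bytes(4, "big")
    # Repetition code (assumption): every byte is sent three times in a row.
    return bytes(b for b in protected for _ in range(3))

def decode(codeword: bytes):
    """Return (message, ok): FEC by bitwise majority vote, then a CRC check."""
    protected = bytes(
        (a & b) | (a & c) | (b & c)  # bitwise majority of the three copies
        for a, b, c in zip(codeword[0::3], codeword[1::3], codeword[2::3])
    )
    message, crc = protected[:-4], protected[-4:]
    ok = zlib.crc32(message).to_bytes(4, "big") == crc
    return message, ok

received = bytearray(encode(b"MIMO"))
received[0] ^= 0xFF                      # corrupt one of the three copies
message, ok = decode(bytes(received))    # FEC repairs it; CRC passes
received[1] ^= 0x01                      # corrupt a second copy of the same bit
_, ok_after = decode(bytes(received))    # FEC fails; CRC flags the error
```

In this sketch, a failed CRC check after FEC decoding (`ok_after` being false) is the condition under which the receiver would request a retransmission.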

HARQ type-II and type-III are different from HARQ type-I, because the data sent on retransmissions of a packet are not the same as the data that was sent originally. HARQ type-II and type-III utilize incremental redundancy in successive retransmissions. That is, the first transmission uses a code with low redundancy. The code rate of a code is defined as the proportion of bits in the vector that carry information and is a metric for determining the throughput of the information. Therefore, the low redundancy code used for the first transmission of a packet has a high code rate, or throughput, but is less powerful at correcting errors. If errors are detected in the first packet, the second transmission is used to increase the redundancy, and therefore the error correcting capability, of the code. For example, if the first transmission uses a code with a code rate of 0.80, a retransmission may add enough extra redundancy to reduce the overall code rate to 0.70. The redundancy of the code may be increased by transmitting extra parity bits or by retransmitting a subset of the bits from the original transmission. If each retransmission can be decoded by itself, the system is HARQ type-III. Otherwise, the system is HARQ type-II.
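The code-rate arithmetic in the example above can be checked with a short calculation. The packet sizes below are assumptions chosen only to reproduce the 0.80 and 0.70 figures, and `overall_rate` is a hypothetical helper name.

```python
from fractions import Fraction

def overall_rate(info_bits, sent_bits_per_transmission):
    """Overall code rate (information bits / total bits sent) after each transmission."""
    rates, total = [], 0
    for sent in sent_bits_per_transmission:
        total += sent
        rates.append(Fraction(info_bits, total))
    return rates

# Assumed sizes: 560 information bits, 700 bits in the first transmission,
# and 100 extra parity bits in the retransmission.
rates = overall_rate(560, [700, 100])
print([float(r) for r in rates])
```

With these assumed sizes, the first transmission has a rate of 560/700 and the overall rate after the retransmission is 560/800.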

SUMMARY OF THE INVENTION

Accordingly, systems and methods for reliable transmission in multiple-input multiple-output systems are disclosed, where a receiver obtains multiple signal vectors from the same transmit signal and combines them prior to decoding.

The transmitter, which has N_(t) outputs, may send an N_(t)-dimensional signal vector to the receiver. The receiver, which has N_(r) inputs, may receive an N_(r)-dimensional signal vector corresponding to the N_(t)-dimensional transmit vector. In accordance with one aspect of the invention, the transmitter sends the same signal vector multiple times to the receiver according to some protocol. Two protocols that may be used are HARQ type-I and repetition coding, or a combination of the two.

It is beneficial for an ARQ or HARQ receiver to utilize data from multiple transmissions of a packet, because even packets that contain errors carry some amount of information about the transmitted packet. However, due to system complexity, and in particular decoder complexity, many practical schemes only use data from a small, fixed number of transmissions. Therefore, the present invention provides systems and methods for effectively utilizing information from an arbitrary number of transmitted packets that does not drastically increase the complexity of the system.

In one embodiment of the present invention, when the receiver has N≧1 received signal vectors corresponding to a common transmit signal vector, the receiver combines the symbols of the received signal vectors. That is, the receiver combines symbols of the N received signal vectors that correspond to the same symbol of the common transmit signal vector. This technique is referred to as symbol-level combining. The signal vectors may be combined by weighted addition of the symbols. In some embodiments, the weights may be chosen to maximize the signal-to-noise ratio at the receiver, a technique that may be referred to as maximal-ratio combining (MRC). Symbol-level combining produces a combined signal vector of the same dimension as the common transmit signal vector. The combined signal vector can be modeled as a single received vector, affected by some channel, represented by a combined channel response matrix, and some noise components, referred to as the combined noise. The combined noise may or may not be white noise. That is, the noise may or may not have a distribution with a flat power spectral density. If the combined noise is white, then the combined signal vector may be directly decoded by a decoder, such as a maximum-likelihood (ML) decoder.

However, in some embodiments, the noise of the combined received signal vector is not white. In these embodiments, the combined signal vector may be processed by a signal processor that whitens the noise. The signal processor may use channel information obtained from a channel preprocessor that operates on the combined channel matrix. After the combined signal vector is processed, the processed combined vector may be decoded by a decoder, such as an ML decoder. The ML decoder may also use channel information obtained from the channel preprocessor. For example, the decoding metric calculated by the decoder may be $\|y' - \tilde{H}^{1/2}x\|^{2}$. In some embodiments, the signal processor may additionally process the combined signal vector in order to reduce the decoding complexity of the system.

In some embodiments of the present invention, a channel preprocessor may perform a Cholesky factorization of a combined channel matrix. In particular, if the combined channel matrix is $\tilde{H}$, the preprocessor may decompose $\tilde{H}$ into a lower triangular matrix, L, and an upper triangular matrix, L*. To whiten the noise of a combined signal vector combined using MRC, a signal processor may multiply the combined received signal vector by L⁻¹. The resulting decoding metric is $\|L^{-1}\tilde{y}_{N} - L^{*}x\|^{2}$. Because L* is an upper triangular matrix rather than a full matrix, such as $\tilde{H}$, the decoding complexity may be reduced considerably.
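The Cholesky-based metric can be illustrated with a small 2×2 sketch. Several things here are assumptions made only for the illustration: the combined quantities are formed as the sum of H_(i)*H_(i) for the combined channel matrix and the sum of H_(i)*y_(i) for the combined vector, the channel matrices are arbitrary example values, the transmissions are noise-free, and all helper names are hypothetical.

```python
import math
from itertools import product

def herm(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

def cholesky2(A):
    """Lower-triangular L with A = L L* for a 2x2 Hermitian positive-definite A."""
    l11 = math.sqrt(A[0][0].real)
    l21 = A[1][0] / l11
    l22 = math.sqrt(A[1][1].real - abs(l21) ** 2)
    return [[complex(l11), 0j], [l21, complex(l22)]]

def metric(L, y_tilde, x):
    """||L^-1 y_tilde - L* x||^2, with L^-1 applied by forward substitution."""
    z0 = y_tilde[0] / L[0][0]
    z1 = (y_tilde[1] - L[1][0] * z0) / L[1][1]
    Lx = matvec(herm(L), x)  # L* is upper triangular
    return abs(z0 - Lx[0]) ** 2 + abs(z1 - Lx[1]) ** 2

# Hypothetical channel matrices for N = 2 transmissions of one 4-QAM vector.
Hs = [[[1 + 1j, 0.5 + 0j], [0.2j, 1 + 0j]],
      [[0.3 + 0j, -1j], [1 + 0j, 0.5 + 0.5j]]]
x_true = (1 + 1j, -1 - 1j)
ys = [matvec(H, x_true) for H in Hs]  # noise-free received vectors

# Assumed combining: H_tilde = sum of H_i* H_i, y_tilde = sum of H_i* y_i.
H_tilde = [[sum(matmul(herm(H), H)[i][j] for H in Hs) for j in range(2)]
           for i in range(2)]
y_tilde = [sum(matvec(herm(H), y)[i] for H, y in zip(Hs, ys)) for i in range(2)]

L = cholesky2(H_tilde)
alphabet = (1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j)
# Exhaustive search over all candidate transmit vectors.
x_hat = min(product(alphabet, repeat=2), key=lambda x: metric(L, y_tilde, x))
```

Because the example is noise-free, the metric evaluates to approximately zero at the transmitted vector, so the exhaustive search returns it.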

In accordance with another aspect of the present invention, a decoding strategy is provided for decoding a combined signal vector based on a decoding metric. The decoding metric may be factored into two parts: 1) a simplified decoding metric which may be a function of channel information and x, the common transmit signal vector being determined, and 2) a modifier that may be a function of channel information. For example, if the decoding metric is $\|L^{-1}\tilde{y}_{N} - L^{*}x\|^{2}$ for a 2-input, 2-output MIMO system, also referred to as a 2×2 system, the simplified decoding metric may be given by $\tilde{D} = \|\hat{L}^{-1}Y - \sqrt{h_{11}^{(2)}}\,\tilde{L}^{*}X\|^{2}$ and the modifier may be given by

$\frac{1}{h_{11}h_{11}^{(2)}}.$ The channel information in the simplified decoding metric may be computed by a preprocessor. A simplified LLR, using the information from the preprocessor and the simplified decoding metric, may be calculated at substantially the same time as the modifier. Then, the modifier and the simplified LLR may be combined using a postprocessor. This decoding strategy is advantageous because it removes computation from the most time-intensive and/or complex calculation in the decoder, the calculation that is repeatedly performed (e.g., $\|y' - \tilde{H}^{1/2}x\|^{2}$, which is performed for all valid values of x).

In some embodiments, the simplified decoding metric may be a linear approximation of the decoding metric. For example, the simplified decoding metric may be $\tilde{D} = \|\hat{L}^{-1}Y - \sqrt{h_{11}^{(2)}}\,\tilde{L}^{*}X\|$. The modifier, in this case, may be adjusted to

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}.$ The linear simplified decoding metric is significantly less complicated to implement, because the computation may be symbol-by-symbol based rather than vector-by-vector based. Furthermore, if the decoder is a hard decoder, the simplified LLR may be directly mapped to a hard value. Therefore, the modifier may not be calculated, saving even more in decoding complexity or decoding speed.

BRIEF DESCRIPTION OF THE FIGURES

The above and other objects and advantages of the invention will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:

FIG. 1 is a high level block diagram of a multiple-input multiple-output data transmission or storage system;

FIG. 2 is a wireless transmission system in accordance with one embodiment of the system in FIG. 1;

FIG. 3 is a block diagram of a transmitter;

FIG. 4A is a signal constellation set for quadrature amplitude modulation with four signal points;

FIG. 4B is a signal constellation set for quadrature amplitude modulation with 16 signal points;

FIG. 5 is a vector model of the system in FIG. 1;

FIG. 6A is a flow diagram of a stop-and-wait HARQ transmitter;

FIG. 6B is a flow diagram of a HARQ receiver;

FIG. 7 is a high level block diagram of a receiver;

FIG. 8 is a detailed embodiment of FIG. 7 for a single input, single output (SISO) system;

FIG. 9 is a diagram illustrating symbol-level combining in a 4-QAM system using weighted addition;

FIGS. 10A-10B show subsets of signal points in a 4-QAM signal constellation set;

FIGS. 11-12 show detailed embodiments of FIG. 7 for a MIMO system;

FIG. 13 shows a detailed embodiment of FIG. 7 for a MIMO system that utilizes QR decomposition;

FIG. 14 shows a detailed embodiment of FIG. 7 for a MIMO system that utilizes Cholesky factorization;

FIG. 15 shows an illustrative flow diagram for decoding a signal vector from a decoding metric;

FIG. 16 shows an illustrative flow diagram for decoding a signal vector in a 2×2 MIMO system employing the strategy of FIG. 15;

FIG. 17 shows an illustrative flow diagram for decoding a signal vector in a 3×3 MIMO system employing the strategy of FIG. 15;

FIG. 18A is a block diagram of an exemplary hard disk drive that can employ the disclosed technology;

FIG. 18B is a block diagram of an exemplary digital versatile disc that can employ the disclosed technology;

FIG. 18C is a block diagram of an exemplary high definition television that can employ the disclosed technology;

FIG. 18D is a block diagram of an exemplary vehicle that can employ the disclosed technology;

FIG. 18E is a block diagram of an exemplary cell phone that can employ the disclosed technology;

FIG. 18F is a block diagram of an exemplary set top box that can employ the disclosed technology; and

FIG. 18G is a block diagram of an exemplary media player that can employ the disclosed technology.

DETAILED DESCRIPTION

The disclosed invention provides a technique in a multiple-input multiple-output data transmission or storage system to decode a signal vector at a receiver, where the receiver may receive multiple signal vectors from the same transmitted signal vector.

FIG. 1 shows an illustration of a basic data transmission or storage system in accordance with one embodiment of the present invention. Data, typically grouped into packets, is sent from transmitter 102 to receiver 112. During transmission, the signals may be altered by a transmission medium, represented by channel 106, and additive noise sources 108. Transmitter 102 has N_(t) outputs 104 and receiver 112 has N_(r) inputs 110, so channel 106 is modeled as a multiple-input multiple-output (MIMO) system with N_(t) inputs and N_(r) outputs. The N_(t) input and N_(r) output dimensions may be implemented using multiple time, frequency, or spatial dimensions, or any combination of such dimensions.

In one embodiment, FIG. 1 represents a wireless communication system, pictured in FIG. 2. In this embodiment, transmitter 102 is a wireless server 204, such as a commercial gateway modem, and receiver 112 is a wireless receiver 206, such as a commercial wireless computer adapter. Channel 106 is space 208 between wireless server 204 and wireless receiver 206, which obstructs and attenuates the signal due to at least multipath fades and shadowing effects. Typically, wireless communication systems use spatial dimensions to implement multiple dimensions in the form of multiple transmitting antennas 200 and receiving antennas 202.

Returning to FIG. 1, transmitter 102 prepares bit sequence 100 into signals capable of transmission through channel 106. For an uncoded system, bit sequence 100 is a binary message, where the message carries only information bits. Alternatively, for a coded system, bit sequence 100 may be an encoded version of the message. Thus, bit sequence 100 may have originated from a binary data source or from the output of a source encoder (not pictured).

One embodiment of transmitter 102 is shown in FIG. 3. Transmitter 102 converts bit sequence 100 into signals 104 appropriate for transmission through channel 106 (FIG. 1). Bit sequence 100 is passed through interleaver/encoder 300, which may interleave and/or encode bit sequence 100. If interleaver/encoder 300 performs encoding, the encoding may be based on any suitable error control code (e.g., convolutional, block, error-detecting, error-correcting, etc.). If interleaving is performed, each bit in bit sequence 100 may be assumed to be independent of all other bits in bit sequence 100. Bit sequence 306 at the output of interleaver/encoder 300 is demultiplexed by demultiplexor 308 across N_(t) paths 310. Each demultiplexed output 310 may or may not go through another interleaver and/or coding block 302, yielding bit sequences 312. Finally, bit sequences 312 are modulated with modulators 304, and are transmitted as signals x₁, . . . , x_(Nt), or x in vector form.

Modulators 304 group the incoming bits into symbols, which are mapped and converted to signals according to a signal constellation set and carrier signal. In one embodiment of the invention, modulator 304 uses quadrature amplitude modulation (QAM). Each symbol is mapped to a signal point in the QAM signal constellation set, where the signal points are differentiated from one another by phase and/or magnitude. For example, FIG. 4A shows a 4-QAM signal constellation set in a complex number plane. In this case, signal points 400A-400D are distinguishable only by phase. Each signal point represents a different two-bit symbol 402A-402D: 400A represents “00” (402A), 400B represents “01” (402B), 400C represents “11” (402C), and 400D represents “10” (402D). However, any other one-to-one mapping from symbol to signal point is valid.

FIG. 4B shows a 16-QAM signal constellation set, where four-bit sequences 406 are combined into one symbol. Here, both the amplitude and the phase of signal points 404 may vary. FIG. 4B shows a partial mapping from symbols 406 to signal points 404, where each symbol is shown closest to its corresponding signal point. However, as before, any other mapping is possible. In general, an m-bit symbol may be mapped according to an M-QAM signal set, where M=2^(m). Therefore, for the transmitter configuration shown in FIG. 3, transmitter 102 is capable of transmitting mN_(t) bits concurrently.
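The symbol mapping and the mN_(t) bit count can be sketched as follows. The 4-QAM coordinates are an assumption (the description fixes only the symbol labels of FIG. 4A, not the coordinates), and the names `QAM4`, `bits_per_symbol`, `bits_per_vector`, and `modulate` are hypothetical.

```python
# Assumed 4-QAM layout: labels 00, 01, 11, 10 on adjacent points, so that
# neighbouring signal points differ in one bit. Coordinates are hypothetical.
QAM4 = {"00": 1 + 1j, "01": -1 + 1j, "11": -1 - 1j, "10": 1 - 1j}

def bits_per_symbol(M):
    """Return m such that M = 2**m; reject sizes that are not powers of two."""
    m = M.bit_length() - 1
    if M < 2 or (1 << m) != M:
        raise ValueError("M must be a power of two")
    return m

def bits_per_vector(M, n_t):
    """Bits carried by one transmit vector x: m bits on each of n_t outputs."""
    return bits_per_symbol(M) * n_t

def modulate(bits, constellation=QAM4, m=2):
    """Group a bit string into m-bit symbols and map each to a signal point."""
    return [constellation[bits[i:i + m]] for i in range(0, len(bits), m)]

x = modulate("0011")  # one transmit vector for a transmitter with N_t = 2
```

For example, a 16-QAM transmitter with four outputs would carry `bits_per_vector(16, 4)` bits per transmit vector under this sketch.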

In accordance with one embodiment of the present invention, transmitter 102 sends the same vector, x, multiple times according to a protocol that is also known and followed by receiver 112. Depending on the protocol, there may be additional components in transmitter 102 that are not shown in FIG. 3. It should be understood that transmitter 102 may be altered in order to implement such protocols. For example, if an automatic repeat request (ARQ) protocol is used, transmitter 102 may need a buffer to store x, or equivalently bit sequence 100, in the event that a retransmission is requested.

Even though x is transmitted, receiver 112 in FIG. 1 actually receives y_(i), where

$\begin{matrix} {y_{i} = H_{i}x + n_{i}, \quad 1 \leq i \leq N} & (1) \end{matrix}$ For clarity, FIG. 5 shows the components of each vector in equation (1). Index i represents the ith instance that the same transmitted vector, x, is transmitted. y_(i) is an N_(r)×1 vector, where each vector component is the signal received by one of the N_(r) inputs of receiver 112. H_(i) 500 is an N_(r)×N_(t) channel matrix that defines how channel 106 alters the transmitted vector, x. n_(i) is an N_(r)×1 vector of additive noise. Note that the characteristics of channel 106, reflected in matrix 500, and noise sources 108, and therefore received signal 110, may be different for each instance i. Differences arise because each transmission of x occurs at a different time or through a different medium.
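The vector model of equation (1) can be simulated with a few lines of code. This is a minimal sketch: the noise is drawn as complex Gaussian samples, which is an assumption of the sketch, and the function name `received_vector`, the channel values, and the noise variance are hypothetical.

```python
import random

def received_vector(H, x, n0, rng):
    """Return y = H x + n for an N_r x N_t channel matrix H (list of rows).

    Each noise component is complex Gaussian with total variance n0,
    split equally between the real and imaginary parts (assumption).
    """
    sigma = (n0 / 2) ** 0.5
    return [
        sum(h * s for h, s in zip(row, x))
        + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        for row in H
    ]

rng = random.Random(0)
x = [1 + 1j, -1 - 1j]            # N_t = 2 transmit vector
H1 = [[1, 0], [0, 1], [1, 1]]    # hypothetical N_r = 3 by N_t = 2 channel
H2 = [[0.5, 1j], [1, 0], [0, 2]]
# Two transmissions of the same x see different channels and different noise.
y1 = received_vector(H1, x, n0=0.1, rng=rng)
y2 = received_vector(H2, x, n0=0.1, rng=rng)
```

Each call produces an N_(r)-component vector, and repeated calls with the same x generally return different values, mirroring the point that each instance i has its own channel matrix and noise.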

In one embodiment, noise sources 108 may be modeled as additive white Gaussian noise (AWGN) sources. In this case, noise sources 108 are independent and identically distributed (i.i.d.). That is, the noise that affects any of the N_(r) components in any n_(i) does not affect the noise for any other component in n_(i), and the noise at any given time does not affect the noise at any other time. Also, all of the noise sources have the same probabilistic characteristics. Furthermore, each component of n_(i) has zero mean and is random in terms of both magnitude and phase, where the magnitude and the phase are also independent. This type of noise source is called an i.i.d. zero mean circularly symmetric complex Gaussian (ZMCSCG) noise source. If the variance of each component is N₀, then the conditional probability density function (pdf) of the received signal, Pr{y|x,H}, is given by

$\begin{matrix} {\Pr\{y \mid x, H\} = \frac{1}{(\pi N_{0})^{N}} \exp\left\{ -\frac{\|y - Hx\|^{2}}{N_{0}} \right\}} & (2) \end{matrix}$ Equation (2) will be used with reference to maximum-likelihood decoding discussed in greater detail below in connection with FIG. 10.
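To make the role of equation (2) concrete, the sketch below evaluates its logarithm and picks the candidate transmit vector that maximizes it, which is the same as minimizing the squared distance between y and Hx. It is an illustration under assumptions: the exponent N in equation (2) is taken to be the number of components of y, the 4-QAM alphabet and channel values are hypothetical, and the exhaustive search is only one possible way to carry out the maximization.

```python
import math
from itertools import product

def log_likelihood(y, H, x, n0):
    """ln Pr{y | x, H} per equation (2), taking N as len(y) (assumption)."""
    dist2 = sum(
        abs(yk - sum(h * s for h, s in zip(row, x))) ** 2
        for yk, row in zip(y, H)
    )
    return -len(y) * math.log(math.pi * n0) - dist2 / n0

def ml_decode(y, H, n0, alphabet, n_t):
    """Exhaustive maximum-likelihood search over all candidate vectors x."""
    return max(product(alphabet, repeat=n_t),
               key=lambda x: log_likelihood(y, H, x, n0))

alphabet = (1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j)  # hypothetical 4-QAM points
H = [[1, 0.5], [0.2, 1]]                        # hypothetical 2x2 channel
x_sent = (1 + 1j, -1 + 1j)
# H x_sent is [0.5+1.5j, -0.8+1.2j]; a small perturbation stands in for noise.
y = [0.6 + 1.4j, -0.85 + 1.3j]
x_hat = ml_decode(y, H, n0=0.1, alphabet=alphabet, n_t=2)
```

Since the normalizing factor in equation (2) does not depend on x, only the distance term affects which candidate is selected.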

Receiver 112 may use one or more of the N received copies of x to determine the information that was transmitted. Receiver 112 may combine multiple received vectors into a single vector for decoding, thereby utilizing more than one, and possibly all, of the transmitted signal vectors. The combining scheme disclosed in the present invention will be discussed in greater detail below in connection with FIGS. 7-11. It should be understood that the receiver in the present invention may combine all received signal vectors. Alternatively, a subset of the received signal vectors and channel matrices may be combined. For example, a received signal and the corresponding channel matrix may be discarded if the magnitude of a component in the received signal vector is below a certain threshold. Thus, the variable N should refer to the number of received signal vectors used by the receiver, which is not necessarily the same as the number of total signal vectors received.

In one embodiment of the invention, receiver 112 receives multiple instances of a common transmit vector using a retransmission protocol. For example, the transmitter and receiver may use a HARQ type-I protocol. Flow charts of the steps taken by transmitter 102 and receiver 112 are shown in FIG. 6A and FIG. 6B, respectively. FIG. 6A shows a transmitter following a stop-and-wait protocol, where the transmitter waits until a signal vector has been accepted by the receiver before sending the next signal vector. Other protocols, such as go-back-N, selective repeat, or any other suitable protocol may be used in place of stop-and-wait. Therefore, it should be understood that FIG. 6A may be modified in order to implement a different protocol.

FIG. 6B shows a simplified flow chart of a HARQ type-I receiver protocol in accordance with one aspect of the invention. At some time, receiver 112 receives y_(i) at step 600, corresponding to the ith transmission of x. At step 602, receiver 112 may combine all the signal vectors corresponding to transmitted signal x that have been received thus far, that is y₁, . . . , y_(i), into a single vector, $\tilde{y}$, and decode the combined vector or a processed version of the combined vector. In FIG. 6B, decoding refers to determining the CRC-protected message based on the combined signal vector. Other possible decoding outputs will be discussed in greater detail below in connection with FIG. 7. Errors in individual signal vectors may be corrected by combining the received signal vectors such that the combined signal vector, $\tilde{y}$, is correctable by decoding. Following decoding, error detection is performed at step 604, which in this case involves checking the CRC of the decoded vector. If errors are detected, the receiver may send a negative acknowledgement (NACK) message to the transmitter at step 606. Upon receipt of the NACK, the transmitter may send the same transmitted signal vector, which is received at step 600 as y_(i+1). y_(i+1) may be different from y_(i) even though the same transmit signal vector x is used at the transmitter, because y_(i+1) results from a transmission at a later time than y_(i) and is affected by different noise and/or channel characteristics. The i+1 vectors are combined and decoded, as described previously. This procedure repeats until, after combining and decoding N received vectors, no CRC error is detected. At this point, the receiver sends an acknowledgment (ACK) message at step 608 back to the transmitter to inform the transmitter that the vector has been successfully received. Also, since there are no errors in the decoded data, the receiver passes the decoded data to the destination at step 610.
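The receiver loop of FIG. 6B can be sketched as a short function. The structure follows steps 600 through 610 as described above, but the details are assumptions made for the illustration: `harq_receive` and its callback parameters are hypothetical names, the toy combiner simply adds real-valued samples (a unit channel is assumed), bits are sent as +1 or −1, and CRC-32 from `zlib` stands in for the error-detecting code.

```python
import zlib

def harq_receive(channel_outputs, combine_and_decode, crc_ok):
    """Stop-and-wait HARQ type-I receiver loop, following FIG. 6B.

    Returns (decoded, feedback), where feedback lists the ACK/NACK messages
    that would be sent back to the transmitter.
    """
    received, feedback = [], []
    for y in channel_outputs:                    # step 600: receive y_i
        received.append(y)
        decoded = combine_and_decode(received)   # step 602: combine and decode
        if crc_ok(decoded):                      # step 604: error detection
            feedback.append("ACK")               # step 608
            return decoded, feedback             # step 610: deliver data
        feedback.append("NACK")                  # step 606: ask for a resend
    return None, feedback

def combine_and_decode(received):
    """Toy symbol-level combiner: add samples per position, then threshold."""
    sums = [sum(samples) for samples in zip(*received)]
    return "".join("1" if s > 0 else "0" for s in sums)

expected_crc = zlib.crc32(b"1011")   # CRC of the transmitted bits "1011"

def crc_ok(bits):
    return zlib.crc32(bits.encode()) == expected_crc

y1 = [0.9, 0.2, 1.1, 0.8]    # second sample pushed positive by noise
y2 = [1.0, -0.9, 0.7, 1.2]
decoded, feedback = harq_receive([y1, y2], combine_and_decode, crc_ok)
```

In this example the first reception alone decodes incorrectly and triggers a NACK, while the combination of the two receptions passes the CRC check and triggers an ACK.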

In another embodiment of the invention, the transmitter sends a signal vector, x, a fixed number of times, irrespective of the presence of errors. For example, the receiver may obtain N transmissions of x from repetition coding. N copies of x may be transmitted simultaneously, or within some interval of time. The receiver combines signal vectors, y₁, . . . , y_(N), and may decode the combination or a processed version of the combination. Repetition coding may be useful when there is no feasible backchannel for the receiver to send retransmission requests.

HARQ type-I and repetition coding are two protocols that may be used in different embodiments of the present invention. Alternatively, repetition coding and HARQ can be combined such that multiple vectors are received at step 600 before combining and decoding at step 602. The invention, however, is not limited to the two protocols and their combination mentioned here. Currently, the IEEE 802.16e standard uses HARQ and repetition coding, so these particular protocols merely illustrate embodiments of the invention. Any protocol that allows the receiver to receive multiple copies of the same transmitted vector falls within the scope of the present invention.

FIG. 7 is a block diagram of one embodiment of receiver 112 in accordance with one aspect of the present invention. Furthermore, it illustrates one way to implement combining and decoding at step 602 in FIG. 6B. Combiner 702, which may or may not use channel information 718 provided from channel combiner/preprocessor 700, combines the symbols of the N received vectors using any suitable combining technique. This type of combining is hereinafter referred to as symbol-level combining, because the combiner operates on the symbols of the signal vector. Combined received vector 706, $\tilde{y}$, can be passed to signal processor 712. Signal processor 712 may process the combined received vector to produce a new signal vector with white noise components. If the noise is already white, signal processor 712 may be bypassed or omitted from the receiver, or may perform other processing functions on the combined received signal vector. Signal processor 712 may also use channel information 716 provided by channel combiner/preprocessor 700. After the noise of the combined received vector is whitened, the processed signal vector, y′, is decoded by decoder 704. Decoder 704 may use channel information 708 provided by channel combiner/preprocessor 700 to operate on processed signal vector 710, y′. Decoder 704 may return an estimate of the signal vector, x. Decoder 704 may return soft information or hard information. If decoder 704 returns hard information, it may have been the result of hard-decoding or soft-decoding. For a coded system, decoder 704 may return coded information or decoded information.

Single-input single-output (SISO) systems are a special case of MIMO systems in which N_(t)=N_(r)=1. System 800, in FIG. 8, shows a detailed embodiment of FIG. 7 for a SISO system. First, the signals are combined by weighted addition. Weights 820 may be chosen to maximize the signal-to-noise ratio (SNR), a technique called maximal ratio combining (MRC). For MRC or other weighted addition combining, weights 820 may be functions of channel information 808 determined by combiner/preprocessor 800. Following combining by symbol combiner 802, combined received signal 806 may be decoded using maximum-likelihood (ML) decoder 804.

FIG. 9 shows an example of a weighted addition combining, HARQ receiver of the configuration shown in FIG. 8. The signal constellation set is 4-QAM, which was described above in connection with FIG. 4A. Signal points 900A-900D represent the magnitude and phase of transmitted symbols 902A-902D, respectively. For illustration purposes, assume that the transmitter is sending the symbol, “00” (902A), to the receiver using a HARQ type-I protocol. Assume, again for the purpose of illustration, that the channel does not attenuate, amplify, or alter the signal in any way. Therefore, ideally, a symbol with the magnitude and phase of signal point 900A would be received. However, if due to additive noise, a signal with a magnitude and phase of signal point 904 is actually received, it will be incorrectly decoded as “01,” because it is closer to signal point 900B than 900A. Note that an ML decoder may make this decision if the noise is assumed to be AWGN. The error-detecting code may then detect the presence of the bit error, resulting in a request for a retransmission. On the second transmission, a signal corresponding to signal point 906 is received. If signal point 906 is decoded on its own, it may be incorrectly decoded as “10.” However, by weighted addition of signal points 904 and 906, the resulting combined symbol may fall approximately on dotted line 908. The combined symbol is now closest to signal point 900A and will be decoded correctly as “00.” Thus, the receiver configuration shown in FIG. 8 may be used to effectively decode multiple received signal vectors.
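The scenario of FIG. 9 can be reproduced numerically. The coordinates below are assumptions chosen to match the description (the figure's actual values are not given in the text): signal points 900A-900D are placed at the corners of a square, and the two received points are picked so that each decodes incorrectly on its own. Equal weights are used because the channel is assumed not to alter the signal.

```python
# Assumed coordinates for signal points 900A-900D and their symbols.
QAM4 = {"00": 1 + 1j, "01": -1 + 1j, "11": -1 - 1j, "10": 1 - 1j}

def nearest_symbol(y):
    """Minimum-distance decision: the symbol whose signal point is closest to y."""
    return min(QAM4, key=lambda s: abs(y - QAM4[s]))

y_904 = -0.2 + 0.9j   # first reception (hypothetical value for point 904)
y_906 = 0.8 - 0.1j    # second reception (hypothetical value for point 906)

# Weighted addition with equal weights, since the channel gain is taken as 1.
combined = 0.5 * y_904 + 0.5 * y_906

first = nearest_symbol(y_904)
second = nearest_symbol(y_906)
both = nearest_symbol(combined)
```

With these assumed values, `first` and `second` are the wrong symbols described above, while `both` is the transmitted symbol.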

Referring back to FIG. 8, a mathematical treatment of the combining scheme for a SISO system is considered. To maximize SNR, weights 820 may take on the value,

$w_{i} = \frac{h_{i}^{*}}{\sqrt{\sum_{i=1}^{N} |h_{i}|^{2}}},$ for each received symbol, y_(i). These weights may be computed by combiner/preprocessor 800. Therefore, the combined received symbol may be equal to:

$\begin{matrix} \begin{matrix} {\overset{\sim}{y} = {\sum\limits_{i = 1}^{N}\frac{h_{i}^{*}y_{i}}{\sqrt{\sum\limits_{i = 1}^{N}{h_{i}}^{2}}}}} \\ {= {{\sqrt{\sum\limits_{i = 1}^{N}{h_{i}}^{2}}x} + {\overset{\sim}{n}(4)}}} \\ {= {{\overset{\sim}{h}x} + {\overset{\sim}{n}(5)}}} \end{matrix} & (3) \end{matrix}$ where

$\overset{\sim}{h} = \sqrt{\sum\limits_{i = 1}^{N}{h_{i}}^{2}}$ and

$\overset{\sim}{n} = {\sum\limits_{i = 1}^{N}{\frac{h_{i}^{*}n_{i}}{\sqrt{\sum\limits_{i = 1}^{N}{h_{i}}^{2}}}.}}$ Note that noise component ñ in the combined received symbol is Gaussian, because a weighted sum of Gaussian variables is still Gaussian. Furthermore, the weights for MRC are chosen such that the noise has unit variance. Therefore, a noise whitening filter, such as signal processor 712 in FIG. 7, is not needed. As shown in equation (5), the combined symbol, {tilde over (y)}, may be treated as an individually received signal vector affected by channel {tilde over (h)} and Gaussian noise ñ.
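The combining step in equations (3) through (5) can be illustrated with a minimal NumPy sketch. The function name `mrc_combine` and the sample channel values are hypothetical and chosen only for illustration. The sketch assumes a noise-free channel so that the identity {tilde over (y)}={tilde over (h)}x can be checked directly.

```python
import numpy as np

def mrc_combine(y, h):
    """Combine N received symbols y_i = h_i * x + n_i with MRC weights.

    Hypothetical helper: returns the combined symbol, the combined
    channel gain, and the weights w_i = conj(h_i) / h_tilde.
    """
    y = np.asarray(y, dtype=complex)
    h = np.asarray(h, dtype=complex)
    h_tilde = np.sqrt(np.sum(np.abs(h) ** 2))  # combined channel gain
    w = np.conj(h) / h_tilde                   # MRC weights
    y_tilde = np.sum(w * y)                    # weighted addition, equation (3)
    return y_tilde, h_tilde, w

# Illustrative (assumed) channel gains and a transmitted 4-QAM symbol.
h = np.array([0.8 + 0.3j, -0.2 + 1.1j, 0.5 - 0.7j])
x = (1 + 1j) / np.sqrt(2)

# Noise-free reception, so the combined symbol should equal h_tilde * x.
y_tilde, h_tilde, w = mrc_combine(h * x, h)
weight_energy = np.sum(np.abs(w) ** 2)
```

Because the weights have unit total energy (`weight_energy` equals one), independent unit-variance noise on each reception yields unit-variance combined noise, which is why no whitening filter is required in the SISO case.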

Therefore, following combining, ML decoder 804 may decode the combined symbol as if it was a single received symbol. ML decoder 804 may calculate a log-likelihood ratio (LLR) for each bit of the common transmit sequence. An LLR is a soft-bit metric often associated with maximum-likelihood decoding. For a received symbol y containing a bit corresponding to transmitted bit b_(k), where y is received from a channel with response h, the LLR for bit b_(k) may be defined as

$\ln\left( \frac{\Pr\{b_{k} = 1 \mid y, h\}}{\Pr\{b_{k} = 0 \mid y, h\}} \right).$ Because {tilde over (y)} may be treated as a single received symbol, the LLR calculation may be expressed as

$\ln\left( \frac{\Pr\{b_{k} = 1 \mid \tilde{y}, \tilde{h}\}}{\Pr\{b_{k} = 0 \mid \tilde{y}, \tilde{h}\}} \right).$ The sign of the LLR indicates the most likely value of the transmitted bit (1 if positive, 0 if negative), and the magnitude of the LLR indicates the strength or confidence of the decision. Thus, ML decoder 804 may output soft information in the form of an LLR for each bit. Alternatively, ML decoder 804 may map the LLR to a hard decision, and output a binary estimate of the transmitted sequence, or may provide the LLR to a soft decoder. To calculate the LLR for a bit, b_(k), of the common transmitted symbol, ML decoder 804 may implement:

$LLR_{SLC} = \min_{\hat{x}^{(0)} \in X_{k}^{(0)}} |\tilde{y} - \tilde{h}\hat{x}^{(0)}|^{2} - \min_{\hat{x}^{(1)} \in X_{k}^{(1)}} |\tilde{y} - \tilde{h}\hat{x}^{(1)}|^{2}, \quad (6)$ which will be derived below in equations (7) through (12). The variable X_(k) ^((j)) in equation (6) denotes a subset of the signal constellation set whose k^(th) bit equals j for j=0,1. For example, FIGS. 10A and 10B illustrate the four possible subsets for a 4-QAM signal constellation set. 4-QAM is discussed in greater detail above in connection with FIG. 4A. In each figure, the relevant bit is underlined for emphasis. Note that, as is consistent with the definition of the subset, the emphasized bit is the same for all members of a subset. Thus, the signal point in quadrant A belongs in subsets X₀ ⁽⁰⁾ and X₁ ⁽⁰⁾. Similarly, the signal point in quadrant B belongs in subsets X₀ ⁽¹⁾ and X₁ ⁽⁰⁾, etc.

Equation (6), the symbol-level combining LLR equation, may be derived as follows:

$\begin{aligned} LLR_{SLC} &= L(b_{k} \mid \tilde{y}, \tilde{h}) && (7) \\ &= \ln\frac{\Pr\{b_{k} = 1 \mid \tilde{y}, \tilde{h}\}}{\Pr\{b_{k} = 0 \mid \tilde{y}, \tilde{h}\}} && (8) \\ &= \ln\frac{\Pr\{\tilde{y} \mid b_{k} = 1, \tilde{h}\}}{\Pr\{\tilde{y} \mid b_{k} = 0, \tilde{h}\}} && (9) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \Pr\{\tilde{y} \mid \hat{x}^{(1)}, \tilde{h}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \Pr\{\tilde{y} \mid \hat{x}^{(0)}, \tilde{h}\}} && (10) \\ &\simeq \ln\frac{\max_{\hat{x}^{(1)} \in X_{k}^{(1)}} \Pr\{\tilde{y} \mid \hat{x}^{(1)}, \tilde{h}\}}{\max_{\hat{x}^{(0)} \in X_{k}^{(0)}} \Pr\{\tilde{y} \mid \hat{x}^{(0)}, \tilde{h}\}} && (11) \\ &= \min_{\hat{x}^{(0)} \in X_{k}^{(0)}} |\tilde{y} - \tilde{h}\hat{x}^{(0)}|^{2} - \min_{\hat{x}^{(1)} \in X_{k}^{(1)}} |\tilde{y} - \tilde{h}\hat{x}^{(1)}|^{2}. && (12) \end{aligned}$ Equations (7) and (8) follow from the definition of the LLR as previously described. Equation (9) is reached by applying Bayes' Theorem, a technique known in the art, to equation (8). Then, equation (10) shows equation (9) written in terms of transmitted symbols, {circumflex over (x)}, instead of transmitted bits, b_(k). For example, in the numerator of equation (9), the probability that b₀=1 is the sum of the probabilities that the transmitted symbol was “01” or “11” for a 4-QAM system. As shown in FIG. 10A, “01” and “11” form subset X₀ ⁽¹⁾. Therefore, Pr{{tilde over (y)}|b₀=1,{tilde over (h)}} is equivalent to Σ_({circumflex over (x)}⁽¹⁾∈X₀⁽¹⁾)Pr{{tilde over (y)}|{circumflex over (x)}⁽¹⁾,{tilde over (h)}}. 
Finally, equation (11) utilizes the approximation, log Σ_(i) a_(i)≈log max_(i)a_(i), and equation (12) results from plugging in equation (2) for the conditional probabilities. Recall that equation (2) is the conditional probability distribution function (PDF) for an AWGN channel.
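The distance-based LLR of equation (12) can be sketched in a few lines of NumPy. The bit-to-symbol labeling below is an assumption made only for illustration (the actual labeling is given in FIG. 4A), and the names `CONSTELLATION`, `subset`, and `llr_slc` are hypothetical. Bit index k=0 is taken to be the rightmost bit, consistent with “01” and “11” forming subset X₀ ⁽¹⁾.

```python
import numpy as np

# Assumed 4-QAM labeling, for illustration only.
CONSTELLATION = {
    "00": (1 + 1j) / np.sqrt(2),
    "01": (-1 + 1j) / np.sqrt(2),
    "11": (-1 - 1j) / np.sqrt(2),
    "10": (1 - 1j) / np.sqrt(2),
}

def subset(k, j):
    """Return the symbols whose k-th bit (k = 0 is the rightmost bit) equals j."""
    return [s for bits, s in CONSTELLATION.items() if int(bits[-1 - k]) == j]

def llr_slc(y_tilde, h_tilde, k):
    """Max-log LLR for bit k: nearest bit-0 distance minus nearest bit-1 distance."""
    d0 = min(abs(y_tilde - h_tilde * s) ** 2 for s in subset(k, 0))
    d1 = min(abs(y_tilde - h_tilde * s) ** 2 for s in subset(k, 1))
    return d0 - d1

# Noise-free reception of "01" over a combined channel gain of 1.5.
h_tilde = 1.5
y_tilde = h_tilde * CONSTELLATION["01"]
llr_bit0 = llr_slc(y_tilde, h_tilde, 0)  # positive: bit 0 is most likely 1
llr_bit1 = llr_slc(y_tilde, h_tilde, 1)  # negative: bit 1 is most likely 0
```

In this sketch the sign of each LLR gives the hard decision and the magnitude gives the confidence, as described above.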

The receiver for a SISO system shown in FIG. 8 with MRC is referred to as an optimal receiver scheme for decoding a signal vector. An optimal receiver scheme is hereinafter defined to be one that, given the N received signal vectors, chooses the signal vector that has the highest probability of being the actual transmit signal vector in the presence of AWGN. This is considered optimum, because all information from the N received signals is used fully. Mathematically, an optimum decoding scheme chooses the signal vector, {circumflex over (x)}, that maximizes Pr{{circumflex over (x)}|y₁, . . . , y_(N), h₁, . . . , h_(N)}.  (13)

A decoder that maximizes equation (13) is a maximum-likelihood decoder. Thus, such a decoder may compute an associated LLR for each bit, which is referred to herein as an optimum LLR, or LLR_(opt).

LLR_(opt) may be derived as follows:

$\begin{aligned} LLR_{opt} &= L(b_{k} \mid y_{1}, \ldots, y_{N}, h_{1}, \ldots, h_{N}) && (14) \\ &= \ln\frac{\Pr\{b_{k} = 1 \mid y_{1}, \ldots, y_{N}, h_{1}, \ldots, h_{N}\}}{\Pr\{b_{k} = 0 \mid y_{1}, \ldots, y_{N}, h_{1}, \ldots, h_{N}\}} && (15) \\ &= \ln\frac{\Pr\{y_{1}, \ldots, y_{N} \mid b_{k} = 1, h_{1}, \ldots, h_{N}\}}{\Pr\{y_{1}, \ldots, y_{N} \mid b_{k} = 0, h_{1}, \ldots, h_{N}\}} && (16) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \Pr\{y_{1}, \ldots, y_{N} \mid \hat{x}^{(1)}, h_{1}, \ldots, h_{N}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \Pr\{y_{1}, \ldots, y_{N} \mid \hat{x}^{(0)}, h_{1}, \ldots, h_{N}\}} && (17) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(1)}, h_{i}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(0)}, h_{i}\}} && (18) \\ &\simeq \ln\frac{\max_{\hat{x}^{(1)} \in X_{k}^{(1)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(1)}, h_{i}\}}{\max_{\hat{x}^{(0)} \in X_{k}^{(0)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(0)}, h_{i}\}} && (19) \\ &= \min_{\hat{x}^{(0)} \in X_{k}^{(0)}} \left\{ \sum_{i=1}^{N} |y_{i} - h_{i}\hat{x}^{(0)}|^{2} \right\} - \min_{\hat{x}^{(1)} \in X_{k}^{(1)}} \left\{ \sum_{i=1}^{N} |y_{i} - h_{i}\hat{x}^{(1)}|^{2} \right\} && (20) \end{aligned}$ Equations (14) and (15) follow from the definition of the log-likelihood ratio. Most of the remaining equations are derived through substantially the same process as equations (7) through (12). 
Equation (18) follows from the statistical independence between each received signal vector. Thus, for independent received symbols y₁ and y₂, Pr(y₁, y₂)=Pr(y₁)Pr(y₂), as shown in equation (18).

Although the LLR determined by the symbol-level combining receiver (equation (12)) does not appear to be equal to the optimal LLR (equation (20)), the difference arises due to the log Σ_(i) a_(i)≈log max_(i)a_(i) approximation. Before the log Σ_(i) a_(i)≈log max_(i)a_(i) approximation, it may be shown that the MRC-based symbol-level-combining scheme of FIG. 8 produces an optimal receiver. Recall that equation (10) is the equation for calculating an LLR in the MRC-based symbol-level combining scheme of FIG. 8 prior to applying the approximation. Equation (10) is reproduced below as equation (21). Thus, the following sequence of equations shows that the LLR produced by symbol-level combining is equivalent to the optimal LLR.

$\begin{aligned} LLR_{SLC} &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \Pr\{\tilde{y} \mid \hat{x}^{(1)}, \tilde{h}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \Pr\{\tilde{y} \mid \hat{x}^{(0)}, \tilde{h}\}} && (21) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \frac{1}{\pi}\exp\{-|\tilde{y} - \tilde{h}\hat{x}^{(1)}|^{2}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \frac{1}{\pi}\exp\{-|\tilde{y} - \tilde{h}\hat{x}^{(0)}|^{2}\}} && (22) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \exp\{-|\tilde{y}|^{2} + 2\Re\{\tilde{y}^{*}\tilde{h}\hat{x}^{(1)}\} - \tilde{h}^{2}|\hat{x}^{(1)}|^{2}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \exp\{-|\tilde{y}|^{2} + 2\Re\{\tilde{y}^{*}\tilde{h}\hat{x}^{(0)}\} - \tilde{h}^{2}|\hat{x}^{(0)}|^{2}\}} && (23) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \exp\left\{2\Re\left\{\left(\sum_{i=1}^{N} y_{i}^{*}h_{i}\right)\hat{x}^{(1)}\right\} - \left(\sum_{i=1}^{N} |h_{i}|^{2}\right)|\hat{x}^{(1)}|^{2}\right\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \exp\left\{2\Re\left\{\left(\sum_{i=1}^{N} y_{i}^{*}h_{i}\right)\hat{x}^{(0)}\right\} - \left(\sum_{i=1}^{N} |h_{i}|^{2}\right)|\hat{x}^{(0)}|^{2}\right\}} && (24) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \exp\left\{-\left(\sum_{i=1}^{N} |y_{i}|^{2}\right) + 2\Re\left\{\left(\sum_{i=1}^{N} y_{i}^{*}h_{i}\right)\hat{x}^{(1)}\right\} - \left(\sum_{i=1}^{N} |h_{i}|^{2}\right)|\hat{x}^{(1)}|^{2}\right\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \exp\left\{-\left(\sum_{i=1}^{N} |y_{i}|^{2}\right) + 2\Re\left\{\left(\sum_{i=1}^{N} y_{i}^{*}h_{i}\right)\hat{x}^{(0)}\right\} - \left(\sum_{i=1}^{N} |h_{i}|^{2}\right)|\hat{x}^{(0)}|^{2}\right\}} && (25) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \exp\left\{-\sum_{i=1}^{N} |y_{i} - h_{i}\hat{x}^{(1)}|^{2}\right\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \exp\left\{-\sum_{i=1}^{N} |y_{i} - h_{i}\hat{x}^{(0)}|^{2}\right\}} && (26) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \prod_{i=1}^{N} \frac{1}{\pi}\exp\{-|y_{i} - h_{i}\hat{x}^{(1)}|^{2}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \prod_{i=1}^{N} \frac{1}{\pi}\exp\{-|y_{i} - h_{i}\hat{x}^{(0)}|^{2}\}} && (27) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(1)}, h_{i}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(0)}, h_{i}\}} && (28) \\ &= LLR_{opt} && (29) \end{aligned}$ Equation (22) follows from equation (21) by plugging in the PDF for an AWGN channel shown in equation (2). The remaining equations follow from mathematical manipulation. Equation (28) is the same as equation (18), which was shown above to be equal to the optimal LLR. Therefore, the decoding scheme used by the receiver in FIG. 8 is an optimal decoding scheme for signals received from an AWGN channel. 
Even if the receiver implements equation (6), which utilizes the log Σ_(i) a_(i)≈log max_(i)a_(i) approximation, the decoding results of the receiver may still be near-optimal.
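The equivalence between the symbol-level combining LLR and the optimal LLR before the approximation can also be checked numerically. The following is a minimal sketch, assuming unit-variance complex AWGN, a hypothetical 4-QAM labeling (the actual labeling is given in FIG. 4A), and randomly drawn channel gains. The names `POINTS`, `exact_llr_slc`, and `exact_llr_opt` are illustrative only.

```python
import numpy as np

# Assumed 4-QAM labeling, for illustration only.
POINTS = {
    "00": (1 + 1j) / np.sqrt(2),
    "01": (-1 + 1j) / np.sqrt(2),
    "11": (-1 - 1j) / np.sqrt(2),
    "10": (1 - 1j) / np.sqrt(2),
}

def subset(k, j):
    """Symbols whose k-th bit (k = 0 is the rightmost bit) equals j."""
    return [s for bits, s in POINTS.items() if int(bits[-1 - k]) == j]

def exact_llr_slc(y, h, k):
    """LLR computed from the MRC-combined symbol, without the max approximation."""
    h_t = np.sqrt(np.sum(np.abs(h) ** 2))
    y_t = np.sum(np.conj(h) * y) / h_t
    num = sum(np.exp(-abs(y_t - h_t * s) ** 2) for s in subset(k, 1))
    den = sum(np.exp(-abs(y_t - h_t * s) ** 2) for s in subset(k, 0))
    return np.log(num / den)

def exact_llr_opt(y, h, k):
    """LLR computed jointly from all N received symbols, without the max approximation."""
    num = sum(np.exp(-np.sum(np.abs(y - h * s) ** 2)) for s in subset(k, 1))
    den = sum(np.exp(-np.sum(np.abs(y - h * s) ** 2)) for s in subset(k, 0))
    return np.log(num / den)

rng = np.random.default_rng(0)
N = 3
h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
n = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
y = h * POINTS["00"] + n

llrs_slc = [exact_llr_slc(y, h, k) for k in (0, 1)]
llrs_opt = [exact_llr_opt(y, h, k) for k in (0, 1)]
```

The two lists agree to numerical precision, because the exponents in the two computations differ only by a term that does not depend on the candidate symbol and therefore cancels in the ratio.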

FIG. 11 shows an illustrative block diagram for a symbol-level combining receiver in a MIMO system. Also, FIG. 11 is a detailed view of one embodiment of the receiver configuration shown in FIG. 7. Combiner 1102 may combine the N received signal vectors by weighted addition. In one embodiment of the present invention, the resulting combined received signal vector may be:

$\begin{aligned} \tilde{y} &= H_{1}^{*}y_{1} + H_{2}^{*}y_{2} + \cdots + H_{N}^{*}y_{N} && (30) \\ &= (H_{1}^{*}H_{1} + H_{2}^{*}H_{2} + \cdots + H_{N}^{*}H_{N})x + \tilde{n} && (31) \\ &= \tilde{H}x + \tilde{n} && (32) \end{aligned}$ where

$\tilde{H} = \sum_{i=1}^{N} H_{i}^{*}H_{i} \quad \text{and} \quad \tilde{n} = \sum_{i=1}^{N} H_{i}^{*}n_{i}.$ {tilde over (H)} is an N_(t)×N_(t) matrix referred to hereinafter as the combined channel matrix, and may be calculated by combiner/preprocessor 1100. ñ is an N_(t)×1 noise vector hereinafter referred to as the combined noise vector. Here, the weights in equations (30) and (31) are chosen to maximize the SNR. Although the term, maximal ratio combining (MRC), is typically used for SISO systems, it will also be used herein to refer to a symbol-level, MIMO combining scheme that maximizes the SNR. Therefore, the embodiment described here can be referred to as an MRC MIMO receiver. Following the combination, equation (32) shows that the combined received signal vector may be modeled as a single received vector, {tilde over (y)}, affected by channel {tilde over (H)} and noise components ñ. Thus, the combined received signal vector may be decoded in a similar manner as any other received signal vector.

However, the covariance of the combined noise vector, ñ, may easily be shown to equal {tilde over (H)}. Therefore, the noise is generally not white, because white noise has a covariance matrix proportional to the identity matrix, whereas {tilde over (H)} is generally not even diagonal. Thus, to whiten the noise components, the combined received signal is processed by signal processor 1112. Signal processor 1112 may whiten the noise by multiplying the signal by {tilde over (H)}^(−1/2), where a matrix A^(1/2) is defined to be any matrix where A^(1/2)A^(1/2)=A. The value of {tilde over (H)}^(−1/2) may be obtained from combiner/preprocessor 1100. Following the multiplication, the processed signal, y′, may be equal to:

$\begin{aligned} y'_{N} &= \tilde{H}_{N}^{-1/2}\tilde{y}_{N} && (33) \\ &= \tilde{H}_{N}^{1/2}x + n'_{N} && (34) \end{aligned}$ where the covariance of the processed noise vector, n′, is E[n′_(N)n′*_(N)]=I_(N) _(t) , as desired. Therefore, the processed combined signal vector, y′, may be modeled as a single received signal vector affected by an AWGN channel, where the channel response matrix is {tilde over (H)}^(1/2) and the noise vector is n′.
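The combining and whitening steps of equations (30) through (34) may be sketched as follows. This is an illustration under stated assumptions rather than the circuit implementation: the helper names `combine_mimo` and `inv_sqrt_hermitian` are hypothetical, the channel matrices are random, and the Hermitian square root obtained from an eigendecomposition is used as one valid choice of A^(1/2). A noise-free channel is assumed so that y′={tilde over (H)}^(1/2)x can be verified exactly.

```python
import numpy as np

def combine_mimo(ys, Hs):
    """Return the combined vector sum(H_i^* y_i) and combined matrix sum(H_i^* H_i)."""
    y_tilde = sum(H.conj().T @ y for H, y in zip(Hs, ys))
    H_tilde = sum(H.conj().T @ H for H in Hs)
    return y_tilde, H_tilde

def inv_sqrt_hermitian(A):
    """Hermitian A^(-1/2) of a Hermitian positive definite matrix A."""
    vals, vecs = np.linalg.eigh(A)
    # Scale each eigenvector column by 1/sqrt(eigenvalue), then rotate back.
    return (vecs / np.sqrt(vals)) @ vecs.conj().T

rng = np.random.default_rng(0)
N, n_r, n_t = 3, 4, 2  # assumed sizes: 3 receptions, 4 receive, 2 transmit antennas
Hs = [rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))
      for _ in range(N)]
x = np.array([1 + 1j, -1 - 1j]) / np.sqrt(2)
ys = [H @ x for H in Hs]  # noise-free receptions of the same transmit vector

y_tilde, H_tilde = combine_mimo(ys, Hs)
W = inv_sqrt_hermitian(H_tilde)           # whitening filter
y_prime = W @ y_tilde                     # processed signal, equation (33)
whitened_cov = W @ H_tilde @ W.conj().T   # covariance of the processed noise
```

Since the combined noise has covariance {tilde over (H)}, the processed noise has covariance `W @ H_tilde @ W.conj().T`, which evaluates to the identity matrix in this sketch.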

The filtered signal, y′, may then be decoded by ML decoder 1104. The ML decoder may calculate the log-likelihood ratio by implementing the equation,

$LLR_{SLC} = \min_{\hat{x}^{(0)} \in X_{k}^{(0)}} \|y' - \tilde{H}^{1/2}\hat{x}^{(0)}\|^{2} - \min_{\hat{x}^{(1)} \in X_{k}^{(1)}} \|y' - \tilde{H}^{1/2}\hat{x}^{(1)}\|^{2}. \quad (35)$ Equation (35) may be derived as follows:

$\begin{aligned} LLR_{SLC} &= L(b_{k} \mid y', \tilde{H}^{1/2}) && (36) \\ &= \ln\frac{\Pr\{b_{k} = 1 \mid y', \tilde{H}^{1/2}\}}{\Pr\{b_{k} = 0 \mid y', \tilde{H}^{1/2}\}} && (37) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \Pr\{y' \mid \hat{x}^{(1)}, \tilde{H}^{1/2}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \Pr\{y' \mid \hat{x}^{(0)}, \tilde{H}^{1/2}\}} && (38) \\ &\simeq \ln\frac{\max_{\hat{x}^{(1)} \in X_{k}^{(1)}} \Pr\{y' \mid \hat{x}^{(1)}, \tilde{H}^{1/2}\}}{\max_{\hat{x}^{(0)} \in X_{k}^{(0)}} \Pr\{y' \mid \hat{x}^{(0)}, \tilde{H}^{1/2}\}} && (39) \\ &= \min_{\hat{x}^{(0)} \in X_{k}^{(0)}} \|y' - \tilde{H}^{1/2}\hat{x}^{(0)}\|^{2} - \min_{\hat{x}^{(1)} \in X_{k}^{(1)}} \|y' - \tilde{H}^{1/2}\hat{x}^{(1)}\|^{2}. && (40) \end{aligned}$ Equations (36) and (37) follow from the definition of the LLR. The remaining equations may be derived through substantially the same process as the process used to obtain equations (7) through (12). ML decoder 1104 may output the LLRs directly as soft information or may convert the LLRs to another soft-bit metric. Alternatively, ML decoder 1104 may map the LLRs to hard decisions, and output a binary sequence estimate of the transmitted sequence, or may output the LLRs to a soft decoder.

It may be shown that the MRC-based symbol-level combining scheme shown in FIG. 11 is an optimal decoding scheme. An optimal LLR for a MIMO system may be calculated as follows:

$\begin{aligned} LLR_{opt} &= L(b_{k} \mid y_{1}, \ldots, y_{N}, H_{1}, \ldots, H_{N}) && (41) \\ &= \ln\frac{\Pr\{b_{k} = 1 \mid y_{1}, \ldots, y_{N}, H_{1}, \ldots, H_{N}\}}{\Pr\{b_{k} = 0 \mid y_{1}, \ldots, y_{N}, H_{1}, \ldots, H_{N}\}} && (42) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \Pr\{y_{1}, \ldots, y_{N} \mid \hat{x}^{(1)}, H_{1}, \ldots, H_{N}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \Pr\{y_{1}, \ldots, y_{N} \mid \hat{x}^{(0)}, H_{1}, \ldots, H_{N}\}} && (43) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(1)}, H_{i}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(0)}, H_{i}\}} && (44) \\ &\simeq \ln\frac{\max_{\hat{x}^{(1)} \in X_{k}^{(1)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(1)}, H_{i}\}}{\max_{\hat{x}^{(0)} \in X_{k}^{(0)}} \prod_{i=1}^{N} \Pr\{y_{i} \mid \hat{x}^{(0)}, H_{i}\}} && (45) \\ &= \min_{\hat{x}^{(0)} \in X_{k}^{(0)}} \left\{ \sum_{i=1}^{N} \|y_{i} - H_{i}\hat{x}^{(0)}\|^{2} \right\} - \min_{\hat{x}^{(1)} \in X_{k}^{(1)}} \left\{ \sum_{i=1}^{N} \|y_{i} - H_{i}\hat{x}^{(1)}\|^{2} \right\} && (46) \end{aligned}$ Equations (41) and (42) follow from the definition of the log-likelihood ratio. The remaining equations are derived through substantially the same process as equations (7) through (12) or equations (14) through (20).

Although the LLR determined by the symbol-level combining receiver (equation (40)) does not appear to be equal to the optimal LLR (equation (46)), the difference arises due to the log Σ_(i) a_(i)≈log max_(i)a_(i) approximation. Before the approximation, it may be shown that the MRC-based symbol-level-combining scheme produces an optimal receiver. The following sequence of equations shows that the LLR produced by symbol-level combining is equivalent to the optimal LLR.

$\begin{aligned} LLR_{SLC} &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \frac{1}{\pi^{N_{r}}}\exp\{-\|y' - \tilde{H}^{1/2}\hat{x}^{(1)}\|^{2}\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \frac{1}{\pi^{N_{r}}}\exp\{-\|y' - \tilde{H}^{1/2}\hat{x}^{(0)}\|^{2}\}} && (47) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \exp\{-(\tilde{y}^{*} - \hat{x}^{(1)*}\tilde{H}^{*})\tilde{H}^{-1}(\tilde{y} - \tilde{H}\hat{x}^{(1)})\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \exp\{-(\tilde{y}^{*} - \hat{x}^{(0)*}\tilde{H}^{*})\tilde{H}^{-1}(\tilde{y} - \tilde{H}\hat{x}^{(0)})\}} && (48) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \exp\{-(\tilde{y}^{*}\tilde{H}^{-1}\tilde{y} - \tilde{y}^{*}\hat{x}^{(1)} - \hat{x}^{(1)*}\tilde{y} + \hat{x}^{(1)*}\tilde{H}\hat{x}^{(1)})\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \exp\{-(\tilde{y}^{*}\tilde{H}^{-1}\tilde{y} - \tilde{y}^{*}\hat{x}^{(0)} - \hat{x}^{(0)*}\tilde{y} + \hat{x}^{(0)*}\tilde{H}\hat{x}^{(0)})\}} && (49) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \exp\left\{-\left(\sum_{i=1}^{N} y_{i}^{*}y_{i} - \sum_{i=1}^{N} y_{i}^{*}H_{i}\hat{x}^{(1)} - \sum_{i=1}^{N} \hat{x}^{(1)*}H_{i}^{*}y_{i} + \sum_{i=1}^{N} \hat{x}^{(1)*}H_{i}^{*}H_{i}\hat{x}^{(1)}\right)\right\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \exp\left\{-\left(\sum_{i=1}^{N} y_{i}^{*}y_{i} - \sum_{i=1}^{N} y_{i}^{*}H_{i}\hat{x}^{(0)} - \sum_{i=1}^{N} \hat{x}^{(0)*}H_{i}^{*}y_{i} + \sum_{i=1}^{N} \hat{x}^{(0)*}H_{i}^{*}H_{i}\hat{x}^{(0)}\right)\right\}} && (50) \\ &= \ln\frac{\sum_{\hat{x}^{(1)} \in X_{k}^{(1)}} \exp\left\{-\sum_{i=1}^{N} \|y_{i} - H_{i}\hat{x}^{(1)}\|^{2}\right\}}{\sum_{\hat{x}^{(0)} \in X_{k}^{(0)}} \exp\left\{-\sum_{i=1}^{N} \|y_{i} - H_{i}\hat{x}^{(0)}\|^{2}\right\}} && (51) \\ &= LLR_{opt} && (52) \end{aligned}$ Equation (47) follows from equation (38), the LLR equation for an MRC-based symbol-level combining receiver, by plugging in the PDF for an AWGN channel shown in equation (2). The remaining equations follow from mathematical manipulation. Equation (51) is equivalent to equation (43) for an AWGN channel, which was shown above to be equal to the optimal LLR. Therefore, the decoding scheme used by the receiver in FIG. 11 may be used to implement an optimal decoding scheme for signal vectors received from an AWGN channel. Even if the receiver implements equation (40), which utilizes the log Σ_(i) a_(i)≈log max_(i)a_(i) approximation, the decoding results of the receiver may still be near-optimal.

Note that the expression, ∥y′−{tilde over (H)}^(1/2)x∥², is essentially a distance calculation, and is a significant portion of the equation for calculating an LLR, shown above as equation (35), for a MIMO system. Therefore, the ∥y′−{tilde over (H)}^(1/2)x∥² distance equation, or any other such equation in an LLR equation, is hereinafter referred to as a decoding metric. The decoding metric for ML decoder 1104 may be calculated as follows:

$\begin{matrix} \begin{matrix} {{{y_{N}^{\prime} - {{\overset{\sim}{H}}_{N}^{\frac{1}{2}}x_{N}}}}^{2} = {\left( {y_{N}^{\prime} - {{\overset{\sim}{H}}_{N}^{\frac{1}{2}}x_{N}}} \right)^{*}\left( {y_{N}^{\prime} - {{\overset{\sim}{H}}_{N}^{\frac{1}{2}}x_{N}}} \right)}} \\ {= {\left( {{{\overset{\sim}{y}}_{N}^{*}{\overset{\sim}{H}}_{N}^{- \frac{*}{2}}} - {x_{N}^{*}{\overset{\sim}{H}}_{N}^{\frac{*}{2}}}} \right)^{*}\left( {{{\overset{\sim}{H}}_{N}^{- \frac{1}{2}}{\overset{\sim}{y}}_{N}} - {{\overset{\sim}{H}}_{N}^{\frac{1}{2}}x_{N}}} \right)}} \\ {= {{x_{N}^{*}{\overset{\sim}{H}}_{N}x_{N}} - {2\Re\left\{ {x_{N}^{*}{\overset{\sim}{y}}_{N}} \right\}} + {{\overset{\sim}{y}}_{N}^{*}{\overset{\sim}{H}}_{N}^{- 1}{\overset{\sim}{y}}_{N}}}} \end{matrix} & \begin{matrix} \begin{matrix} \begin{matrix} \begin{matrix} (53) \\ \; \end{matrix} \\ (54) \end{matrix} \\ \; \end{matrix} \\ (55) \end{matrix} \end{matrix}$ Notice that the last term in equation (55) does not depend on the transmitted signal vector. Therefore, the last term is common to both the numerator and denominator in deriving the LLR (derived above in equations (36) through (39)), and may be ignored in the LLR calculation, or equivalently, the circuit implementation of the calculation.

The receivers illustrated in FIGS. 7, 8, and 11 show all N received vectors and N channel response matrices as inputs into their respective combining blocks. However, all N signal vectors and N channel matrices are not necessarily given to the combiners at the same time, and the receiver is not required to wait until after all N signal vectors are received to begin operating. Instead, the receivers shown in FIGS. 7, 8, and 11 merely illustrate that the system is capable of combining information from all N transmissions of a common transmit signal vector in any suitable manner. In fact, in some embodiments, such as when a HARQ protocol is used, the combiners may only need to accept one signal vector or channel matrix at any given time, and information on the previous transmissions may be obtained from some other source.

FIG. 12 shows a more detailed view of one embodiment of the receiver of FIG. 11 that illustrates how a receiver may operate when N signal vectors are received in groups of P signal vectors, where P≦N. Combiner/preprocessor 1200, combiner 1202, ML decoder 1204, and signal processor 1212 may be substantially the same as and/or have substantially the same functionality as combiner/preprocessor 1100, combiner 1102, ML decoder 1104, and signal processor 1112 of FIG. 11, respectively. The variable P is hereinafter defined to be the number of signal vectors that are received substantially at the same time (e.g., concurrently, within a given amount of time, etc.). Thus, for a HARQ or ARQ protocol, P may be equal to one. For repetition coding or another suitable fixed transmission scheme, P may be equal to N. For other suitable protocols, 1<P<N. For simplicity, it is assumed that N is divisible by P. In this scenario, there are a total of N/P transmissions of P signal vectors. The present invention, however, is not limited to this constrained situation. Also, for clarity, subscripts on any combined vectors or matrices will refer to the number of vectors or matrices included in the combination. For example, {tilde over (y)}_(i) may refer to a combined received signal vector for a combination of received vectors y₁, . . . , y_(i) or y_(i+1), . . . , y_(2i), etc.

When a first set of P signal vectors is received by the system in FIG. 12, no previous information about the common transmit signal vector is available. Therefore, combiners 1202 and 1200 may calculate the combined received vector, {tilde over (y)}_(P), and the combined channel matrix, {tilde over (H)}_(P), for the P signal vectors, respectively. The values of {tilde over (y)}_(P) and {tilde over (H)}_(P) may be stored in storage 1222 and 1220, respectively, for future use. Although storage 1220 and 1222 are shown to be separate in FIG. 12, they may also be a single storage system. Combiner/preprocessor 1200 may additionally calculate {tilde over (H)}_(P) ^(1/2) using {tilde over (H)}_(P). Therefore, ML decoder 1204 may optimally decode for the common transmit signal based on the information available in the P received signal vectors.

When a second set of P signal vectors is received, combiners 1200 and 1202 may combine the newly received signal vectors with the information for the first set of signal vectors stored in storage 1220 and 1222. That is, combiner 1202 may calculate {tilde over (y)}_(P) for the second set of P signal vectors, and may add them to the combined vector that has already been calculated. Similarly, combiner 1200 may calculate {tilde over (H)}_(P) for the second set of P channel matrices, if they are different than the first set, and may add them to the combined channel matrix that has already been calculated. If the channel matrices are the same as for the first transmission, combiner 1200 may simply utilize the information obtained from the previous calculations. Thus, combiners 1200 and 1202 may obtain combined signal vectors and combined channel matrices for the first 2P signal vectors ({tilde over (y)}_(2P) and {tilde over (H)}_(2P)) without re-computing information obtained from previous transmissions. Mathematically, combiners 1200 and 1202 may compute:

$\begin{aligned} \tilde{y}_{2P} &= \sum_{i=1}^{2P} H_{i}^{*}y_{i} = \tilde{y}_{P} + \sum_{j=P+1}^{2P} H_{j}^{*}y_{j} && (56) \\ \tilde{H}_{2P} &= \sum_{i=1}^{2P} H_{i}^{*}H_{i} = \tilde{H}_{P} + \sum_{j=P+1}^{2P} H_{j}^{*}H_{j}. && (57) \end{aligned}$ {tilde over (y)}_(2P) and {tilde over (H)}_(2P) may be stored in storage 1222 and 1220, respectively, by overwriting {tilde over (y)}_(P) and {tilde over (H)}_(P) that were stored after the first transmission. {tilde over (y)}_(2P) and {tilde over (H)}_(2P) may then be utilized when a third set of P signal vectors is received.

Using the storage systems shown in FIG. 12, a receiver may incrementally change its combined received vector and combined channel matrix as new sets of signal vectors are received. After each set of P signal vectors is received, ML decoder 1204 produces an optimal estimate of the common transmit signal vector for the given number of signal vectors that have been received. Thus, the effectiveness of the receiver does not depend on the number of received vectors. This is particularly useful for certain transmission protocols, such as HARQ, where the number of received signal vectors may vary.
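The incremental update of equations (56) and (57) may be sketched as a small accumulator. The class name `IncrementalCombiner` and its attributes are hypothetical and stand in for combiners 1200 and 1202 together with storage 1220 and 1222. The random channel matrices and received vectors are assumptions used only to compare the incremental result against a one-shot combination.

```python
import numpy as np

class IncrementalCombiner:
    """Keeps one running copy of the combined vector and combined channel matrix."""

    def __init__(self, n_t):
        self.y_tilde = np.zeros(n_t, dtype=complex)         # plays the role of storage 1222
        self.H_tilde = np.zeros((n_t, n_t), dtype=complex)  # plays the role of storage 1220

    def update(self, ys, Hs):
        """Fold a newly received group of P vectors into the stored combination."""
        for y, H in zip(ys, Hs):
            self.y_tilde += H.conj().T @ y
            self.H_tilde += H.conj().T @ H
        return self.y_tilde, self.H_tilde

rng = np.random.default_rng(0)
P, n_r, n_t = 2, 3, 2
Hs = [rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))
      for _ in range(2 * P)]
ys = [rng.standard_normal(n_r) + 1j * rng.standard_normal(n_r)
      for _ in range(2 * P)]

combiner = IncrementalCombiner(n_t)
combiner.update(ys[:P], Hs[:P])                # first group of P vectors
y_2p, H_2p = combiner.update(ys[P:], Hs[P:])   # second group of P vectors

# One-shot combination over all 2P vectors, for comparison.
y_batch = sum(H.conj().T @ y for H, y in zip(Hs, ys))
H_batch = sum(H.conj().T @ H for H in Hs)
```

The stored state has a fixed size (one N_(t)×1 vector and one N_(t)×N_(t) matrix) no matter how many groups have been folded in, and the incremental result matches the one-shot combination.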

Another benefit illustrated by the receiver configuration in FIG. 12, which may also be true of any of the other embodiments of the present invention (e.g., FIGS. 7, 11, 13, and 14), is decoder reusability for arbitrary N. That is, only one decoder is implemented no matter how many signal vectors are received. Using a separate decoder for each possible value of N would drastically increase both the amount and complexity of the hardware. In addition, since it would be impractical, if not impossible, to implement a different decoder for every N≧1, the decoding flexibility of the receiver would be limited. Therefore, it may be highly beneficial, in terms of decoder complexity and flexibility, that the receiver configurations shown in FIGS. 7, 11, 12, 13, and 14 may implement a single decoder for arbitrary N.

Another benefit of the receiver configuration in FIG. 12 is memory efficiency. After each set of P signal vectors is received, a new combined signal vector, {tilde over (y)}, and a new combined channel matrix, {tilde over (H)}, are calculated. These may replace the previous information stored in memory. Therefore, the memory requirement of storage 1220 and 1222 does not depend on the number of received vectors. In particular, storage 1220 may be just large enough to store one copy of {tilde over (H)}, and storage 1222 may be just large enough to store one copy of {tilde over (y)}. This is in contrast to a system that re-computes {tilde over (y)} and {tilde over (H)} from scratch each time a new set of vectors is received. In that scenario, the receiver would need to save the signal vectors and channel response matrices for all previous transmissions.

Referring now to FIGS. 13 and 14, other detailed embodiments of FIG. 7 for a symbol-level combining receiver are shown. These embodiments utilize additional signal processing techniques that may be used to reduce the calculation complexity of the ML decoder. Storage systems, such as storage 1220 and 1222, are not expressly shown in FIGS. 13 and 14, but may be assumed to be part of their corresponding combiners.

FIG. 13 shows a symbol-level combining receiver that utilizes QR decomposition to reduce the complexity of calculating the ML decoding metric. In addition to combining channel response matrices and determining {tilde over (H)}^(1/2), combiner/preprocessor 1300 may also factor {tilde over (H)}^(1/2) into a matrix with orthonormal columns, Q, and a square, upper-triangular matrix R. Therefore, {tilde over (H)}^(1/2)=QR and {tilde over (H)}^(−1/2)=R⁻¹Q*. Accordingly, the processed combined received signal vector, y′ _(N) ={tilde over (H)} _(N) ^(−1/2) {tilde over (y)} _(N),  (32) computed by signal processor 1312 may be expressed as,

$\begin{matrix} {y_{N}^{\prime} = {{{\overset{\sim}{H}}_{N}^{1/2}x} + n_{N}^{\prime}}} & (59) \\ {\mspace{34mu}{{= {{QRx} + n_{N}^{\prime}}},}} & (60) \end{matrix}$ where the covariance of the noise is E[n′_(N)n′*_(N)]=I_(N) _(t) . Signal processor 1312 may additionally process y′_(N) by multiplying it by Q*. This operation yields,

$\begin{matrix} {{Q^{*}y_{N}^{\prime}} = {Q^{*}R^{- 1}Q^{*}{\overset{\sim}{y}}_{N}}} & (61) \\ {\mspace{59mu}{= {{Rx} + {Q^{*}n_{N}^{\prime}}}}} & (62) \end{matrix}$ Therefore, because Q* is orthonormal and deterministic, the covariance of Q*n′_(N) is still the identity matrix. Thus, Q*y′_(N) may be treated as a single received signal vector affected by channel R and white noise Q*n′_(N).

After signal processor 1312 in FIG. 13 processes y′, itself computed by signal processor 1312 from the combined received signal vector provided by combiner 1302, decoder 1304 may decode the result using channel information 1308 provided by channel preprocessor 1300. The decoding metric for the processed signal may be ∥Q*y′_(N)−Rx∥², or ∥Q*R⁻¹Q*{tilde over (y)}_(N)−Rx∥². Because R is an upper-triangular matrix, the complexity of the decoding metric may be reduced compared to the complexity of the decoding metric implemented by ML decoder 1204 in FIG. 12.

Referring now to FIG. 14, the illustrated receiver utilizes Cholesky factorization to reduce the complexity of calculating the ML decoding metric. After combiner 1400 generates a combined channel matrix, {tilde over (H)}, the combiner may factor the combined matrix using a Cholesky factorization. The Cholesky factorization factors a square matrix into a lower triangular matrix, L, and its conjugate transpose, L*. Thus, the combined channel matrix may be written as: {tilde over (H)} _(N) =LL*  (63) Therefore, combined received signal vector, {tilde over (y)} _(N) ={tilde over (H)} _(N) x+ñ _(N),  (64) from combiner 1402 may be expressed as, {tilde over (y)} _(N) =LL*x+ñ _(N).  (65) However, the covariance of the combined noise vector, ñ, is equal to {tilde over (H)}. Therefore, the noise is not white, and thus not as easily decodable. To whiten the noise, the combined received vector, {tilde over (y)}, may be passed through signal processor 1412. Signal processor 1412 may multiply {tilde over (y)} by the inverse of L, or L⁻¹, obtained from preprocessor 1400. This produces a processed signal vector,

$\begin{matrix} {y_{N}^{\prime} = {{L^{- 1}{\overset{\sim}{y}}_{N}} = {{L^{- 1}{LL}^{*}x} + {L^{- 1}{\overset{\sim}{n}}_{N}}}}} & (66) \\ {\mspace{130mu}{{= {{L^{*}x} + {\overset{\sim}{n}}_{N}^{\prime}}},}} & (67) \end{matrix}$ where ñ′_(N)=L⁻¹ñ_(N). The new noise component, ñ′_(N), is white, because E[ñ′_(N)ñ′*_(N)]=I_(N) _(t) . Therefore, y′_(N) may be treated as a single received signal affected by channel L* and white noise ñ′_(N), and decoded as such.

Therefore, after signal processor 1412 in FIG. 14 produces y′, decoder 1404 may decode y′ using channel information 1408 provided by channel preprocessor 1400. The decoding metric for the processed signal may be ∥L⁻¹{tilde over (y)}_(N)−L*x∥². Because L* is an upper-triangular matrix, the complexity of the decoding metric may be reduced compared to the complexity of the decoding metric implemented by ML decoder 1204 in FIG. 12.
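As a brief check of the whiteness property relied upon above, and assuming only the noise covariance E[ñ_(N)ñ*_(N)]={tilde over (H)}_(N) and the factorization of equation (63), the covariance of the processed noise may be worked out as,

$E\left[ {\overset{\sim}{n}}_{N}^{\prime}{\overset{\sim}{n}}_{N}^{\prime *} \right] = L^{- 1}E\left[ {\overset{\sim}{n}}_{N}{\overset{\sim}{n}}_{N}^{*} \right]\left( L^{- 1} \right)^{*} = L^{- 1}\left( {LL^{*}} \right)\left( L^{*} \right)^{- 1} = I_{N_{t}}.$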

More detailed embodiments of preprocessor 1400, signal processor 1412, and decoder 1404 (FIG. 14) will be described below in connection with FIGS. 15-17, and equations (68) through (120). In particular, FIGS. 15 and 16 and equations (76) through (98) describe how various components in FIG. 14 may be implemented for a 2-input, 2-output MIMO system. FIGS. 15 and 17 and equations (99) through (120) describe how various components in FIG. 14 may be implemented for a 3-input, 3-output MIMO system. Although only 2-input, 2-output and 3-input, 3-output examples are given, it should be understood that the receiver of FIG. 14 may be practiced according to the description below for any R-input, R-output MIMO system.

Preprocessor 1400 may compute the Cholesky factorization of the combined channel matrix, {tilde over (H)}=LL*, using the Cholesky algorithm. The Cholesky algorithm is an R-step recursive algorithm, where R is the number of inputs or outputs in the MIMO system. Thus, the number of calculations performed by the preprocessor increases as the size of the channel matrix grows. At each step, the Cholesky algorithm calculates a matrix A^((i)), where A ^((i)) =L _(i) A ^((i+1)) L* _(i) ,i=1, . . . , R  (68) The recursive algorithm starts with A⁽¹⁾, which is the original matrix, {tilde over (H)}, and ends with A^((R))=L_(R)A^((R+1))L*_(R), where A^((R+1)) is the identity matrix, I_(R×R). Therefore, by plugging in all R equations for A^((i)), the algorithm yields,

$\begin{matrix} {\overset{\sim}{H} = {A^{(1)} = {{L_{1}\left( {L_{2}\mspace{14mu}\ldots\mspace{14mu}\left( {L_{R}A^{({R + 1})}L_{R}^{*}} \right)\mspace{14mu}\ldots\mspace{14mu} L_{2}^{*}} \right)}L_{1}^{*}}}} & (69) \\ {\mspace{20mu}{= {{L_{1}\left( {L_{2}\mspace{14mu}\ldots\mspace{14mu}\left( {L_{R}I_{R \times R}L_{R}^{*}} \right)\mspace{14mu}\ldots\mspace{14mu} L_{2}^{*}} \right)}L_{1}^{*}}}} & (70) \\ {\mspace{20mu}{= {\left( {L_{1}L_{2}\mspace{14mu}\ldots\mspace{14mu} L_{R}} \right)\left( {L_{R}^{*}\mspace{14mu}\ldots\mspace{14mu} L_{2}^{*}L_{1}^{*}} \right)}}} & (71) \\ {\mspace{20mu}{= {{LL}^{*}.}}} & (72) \end{matrix}$ The result, as expected, is a decomposition of {tilde over (H)} that produces a lower triangular matrix, L=L₁L₂ . . . L_(R), and its conjugate transpose, L*=L*_(R) . . . L*₂L*₁. At each stage i, the matrix A^((i)) may be written as,

$\begin{matrix} {A^{(i)} = {\begin{bmatrix} I_{i - 1} & 0 & 0 \\ 0 & a^{(i)} & b^{{(i)}*} \\ 0 & b^{(i)} & B^{(i)} \end{bmatrix}.}} & (73) \end{matrix}$ a^((i)) is a single entry in A^((i)), b^((i)) is an (R−i)×1 vector, b^((i))* is the conjugate transpose of b^((i)), and B^((i)) is an (R−i)×(R−i) matrix. Using equation (68) and the variables defined in equation (73), the matrices A^((i+1)), for the next step of the algorithm, and L_(i) may be written as,

$\begin{matrix} {A^{({i + 1})} = {\begin{bmatrix} I_{i - 1} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & {B^{(i)} - {\frac{1}{a^{(i)}}b^{(i)}b^{{(i)}*}}} \end{bmatrix}\mspace{14mu}{and}}} & (74) \\ {{L_{i} = \begin{bmatrix} I_{i - 1} & 0 & 0 \\ 0 & \sqrt{a^{(i)}} & 0 \\ 0 & {\frac{1}{\sqrt{a^{(i)}}}b^{(i)}} & I_{R - i} \end{bmatrix}}\mspace{11mu}} & (75) \end{matrix}$ Therefore, preprocessor 1400 may successively calculate matrices L₁, . . . , L_(R), and compute L=L₁ . . . L_(R) and its inverse, L⁻¹=L_(R) ⁻¹L_(R−1) ⁻¹ . . . L₁ ⁻¹.
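The R-step recursion of equations (68) through (75) may be sketched, purely as an illustration, in the following form. The function names and the example matrix are hypothetical. The sketch assumes a Hermitian, positive-definite input and relies on the fact that column i of the product L₁ . . . L_(R) coincides with the non-trivial column of L_(i), so L can be filled in one column per step without forming each L_(i) explicitly.

```python
import math

def cholesky_outer_product(H):
    """Return lower-triangular L with H = L L*, for Hermitian positive-definite H."""
    R = len(H)
    A = [[complex(v) for v in row] for row in H]   # A^(1) = H~
    L = [[0j] * R for _ in range(R)]
    for i in range(R):
        a = A[i][i].real                 # a^(i): real and positive under the assumption
        root = math.sqrt(a)
        L[i][i] = complex(root)          # sqrt(a^(i)), as in L_i of equation (75)
        for r in range(i + 1, R):
            L[r][i] = A[r][i] / root     # b^(i) / sqrt(a^(i))
        for r in range(i + 1, R):
            for c in range(i + 1, R):
                # B^(i) - (1/a^(i)) b^(i) b^(i)*, as in equation (74)
                A[r][c] -= A[r][i] * A[i][c] / a
    return L

def times_ctranspose(L):
    """Compute L L* for a square list-of-rows matrix."""
    n = len(L)
    return [[sum(L[r][k] * L[c][k].conjugate() for k in range(n))
             for c in range(n)] for r in range(n)]

# Assumed example values for a 3x3 combined channel matrix.
H_tilde = [[4, -2j, 2], [2j, 2, 1], [2, 1, 12]]
L = cholesky_outer_product(H_tilde)
rebuilt = times_ctranspose(L)            # should reproduce H_tilde
```

The number of loop iterations grows with R, consistent with the observation that the preprocessor's workload increases with the size of the channel matrix.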

For a 2×2 combined channel matrix,

${\overset{\sim}{H} = {\sum\limits_{i = 1}^{N}{H_{i}^{*}H_{i}}}},$ the matrix components may be represented by h₁₁, h₁₂, h*₁₂, and h₂₂. Thus, the first matrix, A⁽¹⁾, may be given by,

$\begin{matrix} {A^{(1)} = {\overset{\sim}{H} = {\begin{bmatrix} h_{11} & h_{12} \\ h_{12}^{*} & h_{22} \end{bmatrix}.}}} & (76) \end{matrix}$ Note that h₂₁, the first component of the second row, is equal to h*₁₂, because

${\overset{\sim}{H}}^{*} = {\left( {\sum\limits_{i = 1}^{N}{H_{i}^{*}H_{i}}} \right)^{*} = {{\sum\limits_{i = 1}^{N}\left( {H_{i}^{*}H_{i}} \right)^{*}} = {{\sum\limits_{i = 1}^{N}{H_{i}^{*}H_{i}}} = {\overset{\sim}{H}.}}}}$ Using the variables of equation (73), A⁽¹⁾ may also be expressed as,

$\begin{matrix} {A^{(1)} = {\begin{bmatrix} h_{11} & h_{12} \\ h_{12}^{*} & h_{22} \end{bmatrix} \equiv {\begin{bmatrix} a^{(1)} & b^{{(1)}*} \\ b^{(1)} & B^{(1)} \end{bmatrix}.}}} & (77) \end{matrix}$ The first step in the recursive algorithm involves determining A⁽²⁾ and L₁ using equations (74) and (75), respectively. Accordingly, A⁽²⁾ and L₁ may be given by,

$\begin{matrix} {A^{(2)} = {\begin{bmatrix} 1 & 0 \\ 0 & {B^{(1)} - {\frac{1}{a^{(1)}}b^{(1)}b^{{(1)}*}}} \end{bmatrix} = {\begin{bmatrix} 1 & 0 \\ 0 & {\frac{1}{h_{11}}h_{11}^{(2)}} \end{bmatrix}\mspace{14mu}{and}}}} & (78) \\ {{L_{1} = {\begin{bmatrix} \sqrt{a^{(1)}} & 0 \\ \frac{b^{(1)}}{\sqrt{a^{(1)}}} & I_{1} \end{bmatrix}\mspace{11mu} = \begin{bmatrix} \sqrt{h_{11}} & 0 \\ \frac{h_{12}^{*}}{\sqrt{h_{11}}} & 1 \end{bmatrix}}},} & (79) \end{matrix}$ where h₁₁ ⁽²⁾=h₁₁h₂₂−h*₁₂h₁₂.

After determining A⁽²⁾, the second and final step in the Cholesky algorithm involves calculating L₂ and A⁽³⁾. In accordance with equation (73), A⁽²⁾ may be written as,

$\begin{matrix} {A^{(2)} = {\begin{bmatrix} 1 & 0 \\ 0 & {\frac{1}{h_{11}}h_{11}^{(2)}} \end{bmatrix}\mspace{11mu} \equiv {\begin{bmatrix} I_{1} & 0 \\ 0 & a^{(2)} \end{bmatrix}.}}} & (80) \end{matrix}$ Thus, L₂ and A⁽³⁾ may be expressed as,

$\begin{matrix} {A^{(3)} = {\left\lbrack I_{2} \right\rbrack = {\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\mspace{14mu}{and}}}} & (81) \\ {L_{2} = {\begin{bmatrix} I_{1} & 0 \\ 0 & \sqrt{a^{(2)}} \end{bmatrix} = {\begin{bmatrix} 1 & 0 \\ 0 & \frac{\sqrt{h_{11}^{(2)}}}{\sqrt{h_{11}}} \end{bmatrix}\;.}}} & (82) \end{matrix}$ As expected at the final step of the Cholesky algorithm, the matrix, A⁽³⁾=A^((R+1)), is the identity matrix. Note that there are only two steps in the Cholesky algorithm, because {tilde over (H)} is 2×2.

The lower triangular matrix, L, where {tilde over (H)}=LL*, may be determined following the recursive algorithm described above. In general, L is determined by multiplying L₁, . . . , L_(R). Thus, for the 2×2 case, L may be calculated by multiplying L₁ and L₂, producing,

$\begin{matrix} {L = {{L_{1}L_{2}} = {{\begin{bmatrix} \sqrt{h_{11}} & 0 \\ \frac{h_{12}^{*}}{\sqrt{h_{11}}} & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & \frac{\sqrt{h_{11}^{(2)}}}{\sqrt{h_{11}}} \end{bmatrix}} = {\begin{bmatrix} \sqrt{h_{11}} & 0 \\ \frac{h_{12}^{*}}{\sqrt{h_{11}}} & \frac{\sqrt{h_{11}^{(2)}}}{\sqrt{h_{11}}} \end{bmatrix}.}}}} & (83) \end{matrix}$ The inverse of L, or L⁻¹, may also be calculated by computing the inverse of both L₁ and L₂, and multiplying them in reverse order. That is,

$\begin{matrix} {L^{- 1} = {{L_{2}^{- 1}L_{1}^{- 1}} = {\begin{bmatrix} \frac{1}{\sqrt{h_{11}}} & 0 \\ \frac{- h_{12}^{*}}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}} & \frac{\sqrt{h_{11}}}{\sqrt{h_{11}^{(2)}}} \end{bmatrix}.}}} & (84) \end{matrix}$

Therefore, using the Cholesky algorithm, a preprocessor (e.g., preprocessor 1400) in a MIMO receiver may compute L and L⁻¹ for a combined channel matrix, {tilde over (H)}. These matrices may be used by a signal processor, such as signal processor 1412, or by a decoder, such as ML decoder 1404. Alternatively, a preprocessor may have the equations for one or more factorizations, or equivalent representations of the equations, hard-coded or hard-wired. For example, the preprocessor may hard-code or hard-wire equations (83) and (84).
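A hard-coded form of equations (83) and (84) may look like the following sketch. The function name `cholesky_2x2` and the example values are hypothetical; the sketch assumes h₁₁ and h₂₂ are real and positive and that h₁₁ ⁽²⁾ is positive, as is the case for a positive-definite {tilde over (H)}.

```python
import math

def cholesky_2x2(h11, h12, h22):
    """Closed-form L and L^-1 following equations (83) and (84).

    h11 and h22 are assumed real and positive; h12 may be complex.
    """
    h12 = complex(h12)
    h11_2 = h11 * h22 - abs(h12) ** 2          # h11^(2) = h11*h22 - h12* h12
    s1, s2 = math.sqrt(h11), math.sqrt(h11_2)
    L = [[s1, 0], [h12.conjugate() / s1, s2 / s1]]
    L_inv = [[1 / s1, 0], [-h12.conjugate() / (s1 * s2), s1 / s2]]
    return L, L_inv

# Assumed example: H~ = [[4, -2j], [2j, 2]].
L, L_inv = cholesky_2x2(4.0, -2j, 2.0)
```

Only two square roots are taken, and both appear again in the decoding metric discussed below, which is one reason a hard-wired form may be attractive.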

L and L⁻¹, as calculated in the Cholesky algorithm described above, may be used by an ML decoder to compute a log-likelihood ratio for each bit in a transmitted sequence. For example, the receiver in FIG. 14 may be configured such that the ML decoder computes LLRs according to,

$\begin{matrix} {{LLR} = {{- {\min\limits_{b = 0}\left\{ \left\| {{L^{- 1}Y} - {L^{*}X}} \right\|^{2} \right\}}} + {\min\limits_{b = 1}\left\{ \left\| {{L^{- 1}Y} - {L^{*}X}} \right\|^{2} \right\}}}} & (85) \end{matrix}$ Thus, the decoding metric for this receiver may be ∥L⁻¹Y−L*X∥². Plugging in L_(2×2) and L_(2×2) ⁻¹ from the Cholesky factorization described above, the metric implemented by the decoder would be,

$\begin{matrix} {D = {{{{\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}\begin{bmatrix} \sqrt{h_{11}^{(2)}} & 0 \\ {- h_{12}^{*}} & h_{11} \end{bmatrix}}Y} - {{\frac{1}{\sqrt{h_{11}}}\begin{bmatrix} h_{11} & 0 \\ h_{12}^{*} & \sqrt{h_{11}^{(2)}} \end{bmatrix}}^{*}X}}}^{2}} & (86) \end{matrix}$ Because L⁻¹Y may be an input into the decoder (e.g., from signal processor 1412 in FIG. 14), the decoder may actually compute,

$\begin{matrix} {{D = {{y^{\prime} - {{\frac{1}{\sqrt{h_{11}}}\begin{bmatrix} h_{11} & 0 \\ h_{12}^{*} & \sqrt{h_{11}^{(2)}} \end{bmatrix}}^{*}X}}}^{2}},} & (87) \end{matrix}$ where y′=L⁻¹Y is the input. To compute the LLR in equation (85), the decoding metric in equation (87) may be repeatedly computed using all the possible combinations of X. In this way, the decoder may determine the combination that produces the minimum values of equation (87) for b=1 and b=0. For a 2×2 64-QAM system, there are 64 possible values for each symbol in X. Therefore, the distance calculation, ∥L⁻¹Y−L*X∥², would be computed 64×64=4096 times.

Note that the decoding metric shown in equation (87) involves square roots, divisions, and multiplications. These may be computationally expensive operations, and may therefore be time intensive and/or complex to implement. Furthermore, the metric may be computed repeatedly (e.g., 4096 times), which magnifies the effect of the complex or time-intensive computations. The part of the calculation that is repeatedly computed is hereinafter referred to as the critical path. Accordingly, a different decoding strategy is provided in the present invention that reduces the complexity of the critical path. In particular, part of the intensive calculations may be incorporated into a preprocessor (e.g., preprocessor 1400) or into the computation performed after the minimizing values of X are determined.

To reduce the complexity of the critical path, the decoding metric shown in equation (86) may be factored as follows:

$\begin{matrix} {D = {\frac{1}{h_{11}h_{11}^{(2)}}{{{{\begin{bmatrix} \sqrt{h_{11}^{(2)}} & 0 \\ {- h_{12}^{*}} & h_{11} \end{bmatrix}Y} - {{\sqrt{h_{11}^{(2)}}\begin{bmatrix} h_{11} & 0 \\ h_{12}^{*} & \sqrt{h_{11}^{(2)}} \end{bmatrix}}^{*}X}}}^{2}.}}} & (88) \end{matrix}$ For simplicity, the factored decoding metric may be written as,

$\begin{matrix} {D = {\frac{1}{h_{11}h_{11}^{(2)}}{{{{\hat{L}}^{- 1}Y} - {\sqrt{h_{11}^{(2)}}{\overset{\sim}{L}}^{*}X}}}^{2}}} & (89) \\ {\mspace{20mu}{{= {\frac{1}{h_{11}h_{11}^{(2)}}\overset{\sim}{D}}},}} & (90) \end{matrix}$ where

${{\hat{L}}^{- 1} = \begin{bmatrix} \sqrt{h_{11}^{(2)}} & 0 \\ {- h_{12}^{*}} & h_{11} \end{bmatrix}},\mspace{14mu}{{\overset{\sim}{L}}^{*} = \begin{bmatrix} h_{11} & 0 \\ h_{12}^{*} & \sqrt{h_{11}^{(2)}} \end{bmatrix}^{*}},$ and {tilde over (D)} is a simplified decoding metric. Therefore, the LLR may be expressed as,

$\begin{matrix} {{LLR} = {{- {\min\limits_{b = 0}\left\{ \left\| {{L^{- 1}Y} - {L^{*}X}} \right\|^{2} \right\}}} + {\min\limits_{b = 1}\left\{ \left\| {{L^{- 1}Y} - {L^{*}X}} \right\|^{2} \right\}}}} & (91) \\ {\mspace{40mu}{= {\frac{1}{h_{11}h_{11}^{(2)}}{\left( {{- {\min\limits_{b = 0}\left\{ \overset{\sim}{D} \right\}}} + {\min\limits_{b = 1}\left\{ \overset{\sim}{D} \right\}}} \right).}}}} & (92) \end{matrix}$ Note that the simplified decoding metric may be repeatedly computed rather than the original decoding metric. Thus, the

$\frac{1}{h_{11}h_{11}^{(2)}}$ calculation, which involves both a multiplication and a division operation, has been removed from the critical path. Therefore, the complexity of the calculation in the critical path is reduced, but at the expense of increasing the complexity of the final LLR calculation. However, fewer LLRs (e.g., 12 LLRs for a 2×2 64-QAM system, one for each of the six bits in each of the two symbols) are typically calculated than distance calculations (e.g., 4096). Therefore, removing

$\frac{1}{h_{11}h_{11}^{(2)}}$ from the critical path may still provide substantial time and/or complexity savings.

Furthermore, the

$\frac{1}{h_{11}h_{11}^{(2)}}$ term used in the final LLR computation may not be needed until after the critical path calculations are completed. Therefore,

$\frac{1}{h_{11}h_{11}^{(2)}}$ may be computed during the time that the time-intensive critical path calculations are being performed. Therefore, slow, but less complex multiplication and division implementations may be used without increasing the amount of time needed to compute the LLR. For example, the division operation may be implemented using a serial inversion mechanism.
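The factoring of equations (88) through (90) may be checked numerically for a single candidate X with a sketch such as the following. The function names and all numeric values are hypothetical; the sketch assumes a 2×2 system with real, positive h₁₁ and h₁₁ ⁽²⁾.

```python
import math

def norm_sq(v):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(e) ** 2 for e in v)

def matvec(M, v):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(m * e for m, e in zip(row, v)) for row in M]

def metrics(h11, h12, h22, Y, X):
    """Return (D, D_tilde, modifier) for one candidate X in a 2x2 system."""
    h11_2 = h11 * h22 - abs(h12) ** 2
    root = math.sqrt(h11_2)
    scale = math.sqrt(h11) * root
    L_hat_inv = [[root, 0], [-h12.conjugate(), h11]]
    # Conjugate transpose of [[h11, 0], [h12*, root]].
    L_tilde_star = [[h11, h12], [0, root]]
    # Original metric: square roots and divisions inside the repeated part.
    a = [e / scale for e in matvec(L_hat_inv, Y)]               # L^-1 Y
    b = [e / math.sqrt(h11) for e in matvec(L_tilde_star, X)]   # L* X
    D = norm_sq([p - q for p, q in zip(a, b)])
    # Simplified metric: the common factor is left out of the repeated part.
    c = matvec(L_hat_inv, Y)
    d = [root * e for e in matvec(L_tilde_star, X)]
    D_tilde = norm_sq([p - q for p, q in zip(c, d)])
    return D, D_tilde, 1.0 / (h11 * h11_2)

# Assumed example values.
D, D_tilde, modifier = metrics(4.0, -2j, 2.0, [1 + 2j, -1j], [1 + 1j, -1 - 1j])
```

Because the modifier depends only on channel terms, it is computed once here, while D_tilde is the only quantity that would need to be re-evaluated for each candidate X.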

In some embodiments, rather than computing the squared, simplified decoding metric, a linear approximation may be used. For example, the simplified decoding metric may be, {tilde over (D)} _(linear) =∥{circumflex over (L)} ⁻¹ Y−√{square root over (h ₁₁ ⁽²⁾)}{tilde over (L)}*X∥,  (93) which leaves out the squared term in the squared, simplified decoding metric. This approximation may reduce the complexity of the calculation within the critical path, and therefore may result in significant time and/or complexity savings in comparison to the squared version of the distance calculation.

If the linear distance metric in equation (93) is used as the decoding metric, the final LLR calculation may be updated to,

$\begin{matrix} {{LLR} = {\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}\left( {{- {\min\limits_{b = 0}\left\{ {\overset{\sim}{D}}_{linear} \right\}}} + {\min\limits_{b = 1}\left\{ {\overset{\sim}{D}}_{linear} \right\}}} \right)}} & (94) \\ {\mspace{50mu}{= {\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}{{LLR}_{linear}.}}}} & (95) \end{matrix}$ Note that the complexity of the critical path has been reduced again at the expense of the complexity of the final LLR calculation. However, because the

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}$ term may be computed while the critical path calculations are computed,

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}$ may be implemented using techniques that may be low-complexity and time-intensive. Furthermore, if

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}$ is implemented in hardware, √{square root over (h₁₁ ⁽²⁾)} and √{square root over (h₁₁)} may be computed using the same square root circuitry, thereby reducing the total amount of hardware.

Another benefit of implementing the linear decoding metric of equation (93) and the LLR of equation (94) is the fact that the computation is symbol-based rather than vector-based. That is, minimizing {tilde over (D)} may involve determining values for all the symbols in X. However, minimizing {tilde over (D)}_(linear) involves determining the minimum value for a single symbol in X. Therefore, a decoder using the linear metric may output results symbol-by-symbol, rather than in groups of symbols. This may be beneficial when hard-decoding is used. Using hard-decoding, LLR_(linear) may also be computed symbol-by-symbol, and may then be directly mapped to a hard decision. Thus, a

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}$ correction term may not be needed. Without having to compute a division operation and an extra square root operation, the complexity of the system may be further reduced considerably.
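The observation that hard decisions do not need the correction term may be illustrated with the following sketch, which relies only on the correction term being positive, so that it cannot change the sign of an LLR. The function names and numeric values are hypothetical, and the mapping of a non-negative LLR to bit 0 is an assumption based on the sign convention of equation (85).

```python
import math

def hard_decisions(min_b0, min_b1):
    """Map per-bit minima of the squared simplified metric to hard bits, with no modifier."""
    bits = []
    for m0, m1 in zip(min_b0, min_b1):
        llr_linear = -math.sqrt(m0) + math.sqrt(m1)   # unscaled linear LLR
        bits.append(0 if llr_linear >= 0 else 1)
    return bits

def scaled_decisions(min_b0, min_b1, h11, h11_2):
    """Same decisions after applying the 1/(sqrt(h11) sqrt(h11^(2))) correction."""
    mod = 1.0 / (math.sqrt(h11) * math.sqrt(h11_2))
    return [0 if mod * (-math.sqrt(m0) + math.sqrt(m1)) >= 0 else 1
            for m0, m1 in zip(min_b0, min_b1)]

# Assumed minima of D~ over the candidates with the given bit equal to 0 or 1.
min_b0 = [0.5, 9.0, 4.0]
min_b1 = [6.0, 1.0, 2.5]
bits = hard_decisions(min_b0, min_b1)
```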

Referring now to FIG. 15, illustrative flow diagram 1500 is shown for decoding a signal vector based on a decoding metric. The signal vector decoded according to the steps of flow diagram 1500 can be a combined signal vector, such as the combined signal vector produced by combiner 1402 of FIG. 14. At step 1502, channel information can be preprocessed (e.g., by preprocessor 1400 of FIG. 14) for use in evaluating a simplified decoding metric. The simplified decoding metric may be derived from factoring a decoding metric. For example, the decoding metric may be ∥L⁻¹Y−L*X∥², where L* and L⁻¹ are shown above in equation (86). In this case, the simplified decoding metric may be {tilde over (D)}=∥{circumflex over (L)}⁻¹Y−√{square root over (h₁₁ ⁽²⁾)}{tilde over (L)}*X∥². The term factored out of the decoding metric may be

$\frac{1}{h_{11}h_{11}^{(2)}},$ which may be referred to as a modifier value or simply a modifier. Alternatively, the simplified decoding metric may be {tilde over (D)}_(linear)=∥{circumflex over (L)}⁻¹Y−√{square root over (h₁₁ ⁽²⁾)}{tilde over (L)}*X∥, and the resulting modifier may be,

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}.$ Thus, the simplified decoding metric may be a function of signal vectors and channel characteristics, while the modifier may be a function of only the channel characteristics.

The channel preprocessing performed at step 1502 can reduce the amount or complexity of computation performed in the critical path. That is, channel preprocessing can compute, in advance of the operations in the critical path, any functions of channel information that would otherwise have been computed, possibly repeatedly, in the critical path. The preprocessors can compute any channel functions that are common to each evaluation of the simplified decoding metric for the different values of X. For example, if the simplified decoding metric is {tilde over (D)} or {tilde over (D)}_(linear),

${\hat{L}}^{- 1} = {{\begin{bmatrix} \sqrt{h_{11}^{(2)}} & 0 \\ {- h_{12}^{*}} & h_{11} \end{bmatrix}\mspace{14mu}{and}\mspace{14mu}{\overset{\sim}{L}}^{*}} = \begin{bmatrix} h_{11} & 0 \\ h_{12}^{*} & \sqrt{h_{11}^{(2)}} \end{bmatrix}^{*}}$ may be common to each evaluation of the simplified decoding metric. Therefore, a channel preprocessor may compute √{square root over (h₁₁ ⁽²⁾)} at step 1502 for use in evaluating the simplified decoding metric; the same value can also be used to compute the modifier.

With continuing reference to FIG. 15, at step 1504, a soft-bit information metric, such as a log-likelihood ratio, can be computed based on the simplified decoding metric. Continuing the examples described above in step 1502, a soft-bit information metric can be computed in the form of an LLR using the simplified decoding metric, {tilde over (D)}, where

$\begin{matrix} {{{LLR}^{\prime} = {{- {\min\limits_{b = 0}\left\{ \overset{\sim}{D} \right\}}} + {\min\limits_{b = 1}\left\{ \overset{\sim}{D} \right\}}}},} & (96) \end{matrix}$ Alternatively, a soft-bit metric can be computed using the linear simplified decoding metric, {tilde over (D)}_(linear), according to,

$\begin{matrix} {{LLR}_{linear}^{\prime} = {{- {\min\limits_{b = 0}\left\{ {\overset{\sim}{D}}_{linear} \right\}}} + {\min\limits_{b = 1}{\left\{ {\overset{\sim}{D}}_{linear} \right\}.}}}} & (97) \end{matrix}$

The modifier can be computed at step 1506 substantially concurrently with (e.g., in parallel to) step 1504. That is, while the simplified decoding metric is repeatedly computed for different possible values of X, the modifier can be computed. For the example described above, step 1506 may involve computing

$\frac{1}{h_{11}h_{11}^{(2)}}.$ In this embodiment, the hardware (in a hardware implementation) can include a multiplier and a divider. Alternatively, step 1506 may involve computing

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}},$ in which case the hardware may additionally include a square root circuit. In some embodiments, some of the resources used to perform step 1504 may also be used to perform operations in step 1506. As described above, because step 1504 may take a relatively long time to complete, any multiplier, divider, or square root circuit for computing step 1506 can be embodied in a slower and lower-complexity implementation.

At step 1508, the soft-bit information metric and the modifier can be combined to produce soft information corresponding to the transmitted digital sequence. The soft-bit information metric and the modifier may be combined by multiplying the two values. In these embodiments, R multipliers may be implemented to multiply the R soft-bit information metrics by the modifier to create R final LLR values. This combining step may be computed following the critical path, in a postprocessor.

Flow diagram 1500 can be used to decode a combined signal vector in a manner that advantageously pulls as much computation as possible out of the critical path. The computations are instead performed by preprocessors at step 1502 or by postprocessors at step 1508. Thus, computations that are repeatedly performed may have low complexity and/or may be efficient.

Referring now to FIG. 16, flow diagram 1600 shows a more detailed, yet still simplified, illustration of decoding a combined signal vector in a 2×2 MIMO system in accordance with the decoding strategy of flow diagram 1500 (FIG. 15). At step 1602, calculations involved for determining {circumflex over (L)}⁻¹ and {tilde over (L)}* are computed. For a 2×2 system, where

${{\hat{L}}^{- 1} = {{\begin{bmatrix} \sqrt{h_{11}^{(2)}} & 0 \\ {- h_{12}^{*}} & h_{11} \end{bmatrix}\mspace{14mu}{and}\mspace{14mu}{\overset{\sim}{L}}^{*}} = \begin{bmatrix} h_{11} & 0 \\ h_{12}^{*} & \sqrt{h_{11}^{(2)}} \end{bmatrix}^{*}}},$ step 1602 may first involve determining h₁₁ ⁽²⁾=h₁₁h₂₂−h₁₂h*₁₂, and may then involve determining its square root, √{square root over (h₁₁ ⁽²⁾)}. These values may be determined by a channel preprocessor (e.g., preprocessor 1400 (FIG. 14)). At step 1604, a combined received signal vector, {tilde over (y)}, may be processed by multiplying the vector by {circumflex over (L)}⁻¹. The combined received signal vector may be obtained using MRC or any other suitable combining method, such as another form of weighted addition. The combined received signal vector may be obtained from a signal vector combiner, such as MRC combiner 1402 in FIG. 14. The multiplication by {circumflex over (L)}⁻¹ may be performed by a signal processor, such as signal processor 1412 in FIG. 14.

At step 1606, a simplified decoding metric may be calculated for each possible combination of X. For a 2×2 system, the simplified decoding metric may be {tilde over (D)}=∥{circumflex over (L)}⁻¹Y−√{square root over (h₁₁ ⁽²⁾)}{tilde over (L)}*X∥². Thus, at step 1606, √{square root over (h₁₁ ⁽²⁾)}{tilde over (L)}* may be multiplied by each valid common transmit signal vector, X, and the result from each multiplication may be used to determine the simplified decoding metric. Alternatively, the decoding metric may be a linear approximation of the simplified decoding metric, {tilde over (D)}_(linear)=∥{circumflex over (L)}⁻¹Y−√{square root over (h₁₁ ⁽²⁾)}{tilde over (L)}*X∥. Step 1606 may therefore involve computing a suitable decoding metric many times (e.g., 4096 times for a 2×2, 64-QAM system, or 64 times for each symbol). Step 1606 may be performed by a maximum-likelihood decoder, such as by ML decoder 1404 in FIG. 14.

After calculating the decoding metric for each possible X, the minimizing values for b=1 and b=0 are used to determine a simplified LLR at step 1608. As described above, the simplified LLR may be determined by computing,

$\begin{matrix} {{{LLR}^{\prime} = {{- {\min\limits_{b = 0}\left\{ \overset{\sim}{D} \right\}}} + {\min\limits_{b = 1}\left\{ \overset{\sim}{D} \right\}}}},} & (98) \end{matrix}$ or LLR′_(linear). The simplified LLR may be computed by a maximum-likelihood decoder, such as by ML decoder 1404 in FIG. 14. At step 1612, the simplified LLR may be modified by a factor to compute the true LLR. In the 2×2 case, the factor may be

${\frac{1}{h_{11}h_{11}^{(2)}}\mspace{14mu}{or}\mspace{14mu}\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}},$ depending on which decoding metric is used. This factor may be determined by step 1610.

Step 1610 may be executed while steps 1604, 1606, and 1608 are being executed. Namely, step 1610 may be computed at any time while steps 1604, 1606 and 1608 are computed. Alternatively, step 1610 may be computed some time before or some time after the other steps. Step 1610 involves performing calculations that are not used by steps 1604, 1606, and 1608, but are used to compute the final LLR value. Thus, step 1610 may perform any suitable calculations that are used in calculations after the critical path (e.g., step 1612). For a 2×2 system, step 1610 may involve computing h₁₁h₁₁ ⁽²⁾, and using the result to compute

$\frac{1}{h_{11}h_{11}^{(2)}}.$ Alternatively, step 1610 may involve computing √{square root over (h₁₁)}, then using the result to compute √{square root over (h₁₁)}√{square root over (h₁₁ ⁽²⁾)}, and finally computing

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}.$ Recall that √{square root over (h₁₁ ⁽²⁾)} has already been computed at step 1602. Thus, √{square root over (h₁₁)} may be computed using the same hardware, if applicable, as the hardware used to compute √{square root over (h₁₁ ⁽²⁾)}.

$\frac{1}{h_{11}h_{11}^{(2)}}\mspace{14mu}{or}\mspace{14mu}\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}$ may be used by step 1612 to compute the final LLR, as described above. Step 1610 may be computed by a channel processor, such as preprocessor 1400 in FIG. 14.
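The division of work among steps 1602 through 1612 may be sketched end to end as follows, using the squared simplified metric. This is an illustration under stated assumptions rather than an implementation of any particular figure: the function names, the QPSK labeling (chosen to keep the search at 16 candidates), and the numeric values are all hypothetical, and the steps run sequentially here even though step 1610 could execute alongside steps 1604 through 1608.

```python
import itertools
import math

# Hypothetical QPSK labeling, used only to keep the candidate search small.
QPSK = {(0, 0): 1 + 1j, (0, 1): 1 - 1j, (1, 0): -1 + 1j, (1, 1): -1 - 1j}

def step_1602(h11, h12, h22):
    """Preprocess channel terms shared by every metric evaluation."""
    h11_2 = h11 * h22 - abs(h12) ** 2
    root = math.sqrt(h11_2)
    L_hat_inv = [[root, 0], [-h12.conjugate(), h11]]
    channel = [[root * h11, root * h12], [0, h11_2]]   # sqrt(h11^(2)) times L~*
    return h11_2, L_hat_inv, channel

def step_1604(L_hat_inv, y):
    """Process the combined received vector by multiplying it by L-hat^-1."""
    return [L_hat_inv[0][0] * y[0],
            L_hat_inv[1][0] * y[0] + L_hat_inv[1][1] * y[1]]

def steps_1606_1608(z, channel, constellation):
    """Critical path: simplified metric for every X, then the simplified LLRs."""
    labels = list(constellation)
    best = [[math.inf, math.inf] for _ in range(2 * len(labels[0]))]
    for l0, l1 in itertools.product(labels, repeat=2):
        x0, x1 = constellation[l0], constellation[l1]
        e0 = z[0] - (channel[0][0] * x0 + channel[0][1] * x1)
        e1 = z[1] - channel[1][1] * x1
        d = abs(e0) ** 2 + abs(e1) ** 2
        for k, b in enumerate(l0 + l1):
            best[k][b] = min(best[k][b], d)
    return [-m0 + m1 for m0, m1 in best]

def step_1610(h11, h11_2):
    """Modifier; it does not depend on the received vector or on X."""
    return 1.0 / (h11 * h11_2)

def step_1612(simplified, modifier):
    """Postprocess: scale each simplified LLR to obtain the final LLR."""
    return [modifier * v for v in simplified]

h11, h12, h22 = 4.0, -2j, 2.0          # assumed combined-channel entries
y_tilde = [1 + 2j, -3 + 0.5j]          # assumed combined received vector
h11_2, L_hat_inv, channel = step_1602(h11, h12, h22)
simplified = steps_1606_1608(step_1604(L_hat_inv, y_tilde), channel, QPSK)
final_llrs = step_1612(simplified, step_1610(h11, h11_2))
```

In this sketch the inner loop contains only multiplications, additions, and magnitude computations; the square root appears once in `step_1602` and the division once in `step_1610`.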

The Cholesky factorization and decoding examples given above in connection with equations (76) through (98) and FIGS. 15 and 16 have been for 2-input, 2-output MIMO systems. It should be understood, however, that the Cholesky factorization may be applied to any R-input, R-output MIMO system, and flow diagrams 1500 and 1600 may be utilized for any R-input, T-output MIMO system. To illustrate the above-described aspect of the present invention further, a full example for 3×3 {tilde over (H)} will be described below in connection with FIG. 17 and equations (99) through (120).

A Cholesky factorization for a 3×3 combined channel matrix, {tilde over (H)}, is described herein. The components of {tilde over (H)} may be represented by h₁₁, h₁₂, h*₁₂, h₁₃, h*₁₃, h₂₂, h₂₃, h*₂₃, and h₃₃. Thus, the first matrix, A⁽¹⁾, may be given by,

$\begin{matrix} {A^{(1)} = {\overset{\sim}{H} = {\begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{12}^{*} & h_{22} & h_{23} \\ h_{13}^{*} & h_{23}^{*} & h_{33} \end{bmatrix}.}}} & (99) \end{matrix}$ In accordance with equation (73), variables a⁽¹⁾, b⁽¹⁾, and B⁽¹⁾ may take on the following values:

$\begin{matrix} {{a^{(1)} = h_{11}},} & (100) \\ {{b^{(1)} = \begin{bmatrix} h_{12}^{*} \\ h_{13}^{*} \end{bmatrix}},\mspace{14mu}{and}} & (101) \\ {B^{(1)} = {\begin{bmatrix} h_{22} & h_{23} \\ h_{23}^{*} & h_{33} \end{bmatrix}.}} & (102) \end{matrix}$ The first step in the Cholesky recursive algorithm involves determining A⁽²⁾ and L₁ using equations (74) and (75), respectively. Accordingly, A⁽²⁾ and L₁ may be given by,

$\begin{matrix} \begin{matrix} {A^{(2)} = \begin{bmatrix} I_{1} & 0 \\ 0 & {B^{(1)} - {\frac{1}{h_{11}}b^{(1)}b^{{(1)}*}}} \end{bmatrix}} \\ {= {\begin{bmatrix} 1 & 0 & 0 \\ 0 & {\frac{1}{h_{11}}h_{11}^{(2)}} & {\frac{1}{h_{11}}h_{12}^{(2)}} \\ 0 & {\frac{1}{h_{11}}h_{12}^{{(2)}*}} & {\frac{1}{h_{11}}h_{22}^{(2)}} \end{bmatrix}\mspace{14mu}{and}}} \end{matrix} & (103) \\ {{L_{1} = {\begin{bmatrix} \sqrt{a^{(1)}} & 0 \\ {\frac{1}{\sqrt{a^{(1)}}}b^{(1)}} & I_{2} \end{bmatrix} = \begin{bmatrix} \sqrt{h_{11}} & 0 & 0 \\ \frac{h_{12}^{*}}{\sqrt{h_{11}}} & 1 & 0 \\ \frac{h_{13}^{*}}{\sqrt{h_{11}}} & 0 & 1 \end{bmatrix}}},} & (104) \end{matrix}$ where h₁₁ ⁽²⁾=h₁₁h₂₂−h*₁₂h₁₂, h₁₂ ⁽²⁾=h₁₁h₂₃−h*₁₂h₁₃, and h₂₂ ⁽²⁾=h₁₁h₃₃−h*₁₃h₁₃.

After determining A⁽²⁾, the second step in the Cholesky algorithm involves determining A⁽³⁾ and L₂ using equations (74) and (75) once again. First, from equation (73), variables a⁽²⁾, b⁽²⁾, and B⁽²⁾ may take on the following values:

$\begin{matrix} {{a^{(2)} = {\frac{1}{h_{11}}h_{11}^{(2)}}},} & (105) \\ {{b^{(2)} = {\frac{1}{h_{11}}h_{12}^{{(2)}^{*}}}},{and}} & (106) \\ {B^{(2)} = {\frac{1}{h_{11}}{h_{22}^{(2)}.}}} & (107) \end{matrix}$ Therefore, A⁽³⁾ and L₂ may be given by,

$\begin{matrix} {A^{(3)} = \begin{bmatrix} I_{2} & 0 \\ 0 & {B^{(2)} - {\frac{1}{\frac{1}{h_{11}}h_{11}^{(2)}}b^{(2)}b^{{(2)}^{*}}}} \end{bmatrix}} & (108) \\ {= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & {\frac{1}{h_{11}}\frac{1}{h_{11}^{(2)}}\left( {{h_{11}^{(2)}h_{22}^{(2)}} - {h_{12}^{{(2)}^{*}}h_{12}^{(2)}}} \right)} \end{bmatrix}} & \\ {{\equiv \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & {\frac{1}{h_{11}}\frac{1}{h_{11}^{(2)}}h_{11}^{(3)}} \end{bmatrix}},\mspace{14mu}{and}} & (109) \\ {{L_{2} = {\begin{bmatrix} I_{1} & 0 & 0 \\ 0 & \sqrt{a^{(2)}} & 0 \\ 0 & {\frac{1}{\sqrt{a^{(2)}}}b^{(2)}} & I_{1} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & {\frac{1}{\sqrt{h_{11}}}\sqrt{h_{11}^{(2)}}} & 0 \\ 0 & {\frac{1}{\sqrt{h_{11}}}\frac{h_{12}^{{(2)}^{*}}}{\sqrt{h_{11}^{(2)}}}} & 1 \end{bmatrix}}},} & (110) \end{matrix}$ where h₁₁ ⁽³⁾=h₁₁ ⁽²⁾h₂₂ ⁽²⁾−h₁₂ ⁽²⁾*h₁₂ ⁽²⁾.

After determining A⁽³⁾, the third and final step in the Cholesky algorithm involves calculating A⁽⁴⁾ and L₃. In accordance with equation (73), A⁽³⁾ may be written as,

$\begin{matrix} {A^{(3)} = {\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & {\frac{1}{h_{11}}\frac{1}{h_{11}^{(2)}}h_{11}^{(3)}} \end{bmatrix} = {\begin{bmatrix} I_{2} & 0 \\ 0 & a^{(3)} \end{bmatrix}.}}} & (111) \end{matrix}$ Thus, A⁽⁴⁾ and L₃ may be expressed as,

$\begin{matrix} {{A^{(4)} = {\left\lbrack I_{3} \right\rbrack = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}},{and}} & (112) \\ {L_{3} = {\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \sqrt{a^{(3)}} \end{bmatrix} = {\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & {\frac{1}{\sqrt{h_{11}}}\frac{1}{\sqrt{h_{11}^{(2)}}}\sqrt{h_{11}^{(3)}}} \end{bmatrix}.}}} & (113) \end{matrix}$ As expected at the final step of the Cholesky algorithm, the matrix A⁽⁴⁾=A^((R+1)), is the identity matrix.
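The recursion traced through equations (99) to (113) can be sketched numerically. The following is a minimal illustration in plain Python, not an implementation from this disclosure: the function names and the sample matrix `H` are hypothetical, and the sketch assumes the input is Hermitian and positive definite. At each step it takes the pivot a⁽ᵏ⁾, writes √a⁽ᵏ⁾ and b⁽ᵏ⁾/√a⁽ᵏ⁾ into column k of L, and replaces the trailing block with B⁽ᵏ⁾ minus (1/a⁽ᵏ⁾)b⁽ᵏ⁾b⁽ᵏ⁾*.

```python
import math

def cholesky_recursive(H):
    """Return lower-triangular L with H = L L* for a Hermitian positive-definite H."""
    R = len(H)
    A = [[complex(v) for v in row] for row in H]   # A^(1) = H
    L = [[0j] * R for _ in range(R)]
    for k in range(R):
        a = A[k][k].real                           # pivot a^(k)
        if a <= 0.0:
            raise ValueError("matrix is not positive definite")
        root = math.sqrt(a)
        L[k][k] = complex(root)                    # sqrt(a^(k)) on the diagonal
        for i in range(k + 1, R):
            L[i][k] = A[i][k] / root               # b^(k) / sqrt(a^(k)) below it
        for i in range(k + 1, R):                  # B^(k) - (1/a^(k)) b^(k) b^(k)*
            for j in range(k + 1, R):
                A[i][j] -= A[i][k] * A[j][k].conjugate() / a
    return L

def times_conj_transpose(L):
    """Return the product L L*."""
    R = len(L)
    return [[sum(L[i][m] * L[j][m].conjugate() for m in range(R))
             for j in range(R)] for i in range(R)]

# Hypothetical Hermitian positive-definite 3x3 matrix, chosen only for illustration.
H = [[4, 1 + 1j, 0.5j],
     [1 - 1j, 3, 1],
     [-0.5j, 1, 2]]
L = cholesky_recursive(H)
H_rebuilt = times_conj_transpose(L)   # should reproduce H up to rounding
```

For this sample matrix the pivots work out to h₁₁=4, h₁₁ ⁽²⁾=10, and h₁₁ ⁽³⁾=65, so the last diagonal entry of L equals √(65/40), in agreement with equation (113).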

The lower triangular matrix, L, where {tilde over (H)}=LL*, may be determined following the recursive algorithm described above. In general, L is determined by multiplying L₁, . . . , L_(P). Accordingly, for the 3×3 case, L may be calculated by multiplying L₁, L₂, and L₃. Thus,

$\begin{matrix} {L = {L_{1}L_{2}L_{3}}} & \\ {= {\frac{1}{\sqrt{h_{11}}}\frac{1}{\sqrt{h_{11}^{(2)}}}\begin{bmatrix} {\sqrt{h_{11}^{(2)}}h_{11}} & 0 & 0 \\ {\sqrt{h_{11}^{(2)}}h_{12}^{*}} & h_{11}^{(2)} & 0 \\ {\sqrt{h_{11}^{(2)}}h_{13}^{*}} & h_{12}^{{(2)}^{*}} & \sqrt{h_{11}^{(3)}} \end{bmatrix}}} & (114) \\ {{\equiv {\frac{1}{\sqrt{h_{11}}}\frac{1}{\sqrt{h_{11}^{(2)}}} \cdot \overset{\sim}{L}}},} & (115) \end{matrix}$ where

$\begin{matrix} {\overset{\sim}{L} = {\begin{bmatrix} {\sqrt{h_{11}^{(2)}}h_{11}} & 0 & 0 \\ {\sqrt{h_{11}^{(2)}}h_{12}^{*}} & h_{11}^{(2)} & 0 \\ {\sqrt{h_{11}^{(2)}}h_{13}^{*}} & h_{12}^{{(2)}^{*}} & \sqrt{h_{11}^{(3)}} \end{bmatrix}.}} & (116) \end{matrix}$ The inverse of L, or L⁻¹, may also be calculated by computing the inverses of L₁, L₂, L₃, and multiplying them in reverse order. That is,

$\begin{matrix} {L^{- 1} = {L_{3}^{- 1}L_{2}^{- 1}L_{1}^{- 1}}} & \\ {= {\frac{1}{\sqrt{h_{11}}}\frac{1}{\sqrt{h_{11}^{(2)}}}\frac{1}{\sqrt{h_{11}^{(3)}}}\begin{bmatrix} {\sqrt{h_{11}^{(2)}}\sqrt{h_{11}^{(3)}}} & 0 & 0 \\ {{- h_{12}^{*}}\sqrt{h_{11}^{(3)}}} & {h_{11}\sqrt{h_{11}^{(3)}}} & 0 \\ {{{- h_{13}^{*}}h_{11}^{(2)}} + {h_{12}^{*}h_{12}^{{(2)}^{*}}}} & {{- h_{11}}h_{12}^{{(2)}^{*}}} & {h_{11}h_{11}^{(2)}} \end{bmatrix}}} & (117) \\ {{= {\frac{1}{\sqrt{h_{11}}}\frac{1}{\sqrt{h_{11}^{(2)}}}\frac{1}{\sqrt{h_{11}^{(3)}}}{\hat{L}}^{- 1}}},} & (118) \end{matrix}$ where

$\begin{matrix} {{\hat{L}}^{- 1} = {\begin{bmatrix} {\sqrt{h_{11}^{(2)}}\sqrt{h_{11}^{(3)}}} & 0 & 0 \\ {{- h_{12}^{*}}\sqrt{h_{11}^{(3)}}} & {h_{11}\sqrt{h_{11}^{(3)}}} & 0 \\ {{{- h_{13}^{*}}h_{11}^{(2)}} + {h_{12}^{*}h_{12}^{{(2)}^{*}}}} & {{- h_{11}}h_{12}^{{(2)}^{*}}} & {h_{11}h_{11}^{(2)}} \end{bmatrix}.}} & (119) \end{matrix}$

If the combined channel matrix is 3×3, a preprocessor may compute the 3×3 Cholesky algorithm, as described above in connection with equations (99) through (119). Alternatively, a preprocessor may have equations for one or more factorizations, or equivalent representations of the equations, hard-coded or hard-wired. For example, the preprocessor may hard-code or hard-wire equations (114) and (117).
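A hard-coded version of this preprocessing might look like the sketch below, which builds {tilde over (L)} and {circumflex over (L)}⁻¹ entry by entry using only multiplications, additions, and two square roots. The function names and the sample matrix are hypothetical, and {tilde over (H)} is assumed to be Hermitian and positive definite. The sketch also assumes that the (3,2) entry of {circumflex over (L)}⁻¹ carries the conjugate, −h₁₁h₁₂ ⁽²⁾*, which is the value for which the product {circumflex over (L)}⁻¹{tilde over (L)} comes out as a scaled identity matrix.

```python
import math

def preprocess_3x3(H):
    """Return (L_tilde, L_hat_inv, pivots) for a Hermitian positive-definite 3x3 H."""
    h11, h22, h33 = H[0][0].real, H[1][1].real, H[2][2].real
    h12, h13, h23 = complex(H[0][1]), complex(H[0][2]), complex(H[1][2])
    c12, c13 = h12.conjugate(), h13.conjugate()
    # Quantities from the first and second Cholesky steps; no division is used.
    h11_2 = h11 * h22 - (c12 * h12).real
    h12_2 = h11 * h23 - c12 * h13
    h22_2 = h11 * h33 - (c13 * h13).real
    c12_2 = h12_2.conjugate()
    h11_3 = h11_2 * h22_2 - (c12_2 * h12_2).real
    s2, s3 = math.sqrt(h11_2), math.sqrt(h11_3)
    L_tilde = [[s2 * h11, 0, 0],
               [s2 * c12, h11_2, 0],
               [s2 * c13, c12_2, s3]]
    L_hat_inv = [[s2 * s3, 0, 0],
                 [-c12 * s3, h11 * s3, 0],
                 [-c13 * h11_2 + c12 * c12_2, -h11 * c12_2, h11 * h11_2]]
    return L_tilde, L_hat_inv, (h11, h11_2, h11_3)

def mat_mul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical Hermitian positive-definite matrix, chosen only for illustration.
H = [[4, 1 + 1j, 0.5j],
     [1 - 1j, 3, 1],
     [-0.5j, 1, 2]]
L_tilde, L_hat_inv, pivots = preprocess_3x3(H)
product = mat_mul(L_hat_inv, L_tilde)
# Expected: product = h11 * h11_2 * sqrt(h11_3) times the 3x3 identity.
scale = pivots[0] * pivots[1] * math.sqrt(pivots[2])
```

Because L and L⁻¹ differ from {tilde over (L)} and {circumflex over (L)}⁻¹ only by scalar factors, checking that the product is a multiple of the identity is a convenient sanity test for a hard-wired implementation.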

FIG. 17 shows illustrative flow diagram 1700 for decoding a combined received signal vector from a 3×3 MIMO system in accordance with the decoding strategy of flow diagram 1500 (FIG. 15). At step 1702, processing is performed on components of the combined channel response matrix that will be used to calculate a simplified decoding metric. In particular, processing may be performed to determine the {tilde over (L)} and {circumflex over (L)}⁻¹ matrices, shown in equations (116) and (119). First, h₁₁ ⁽²⁾=h₁₁h₂₂−h*₁₂h₁₂, h₁₂ ⁽²⁾=h₁₁h₂₃−h*₁₂h₁₃, and h₂₂ ⁽²⁾=h₁₁h₃₃−h*₁₃h₁₃, defined in the first step of the Cholesky factorization, may be determined. Using the results, h₁₁ ⁽³⁾=h₁₁ ⁽²⁾h₂₂ ⁽²⁾−h₁₂ ⁽²⁾*h₁₂ ⁽²⁾, defined in the second step of the Cholesky factorization, may be calculated. Also, the square root of h₁₁ ⁽²⁾, √{square root over (h₁₁ ⁽²⁾)}, may be calculated in parallel. Following the determination of h₁₁ ⁽³⁾, the square root of h₁₁ ⁽³⁾, √{square root over (h₁₁ ⁽³⁾)}, may also be calculated. In some embodiments, the square root circuitry (if applicable) used to calculate √{square root over (h₁₁ ⁽²⁾)} may be used to calculate √{square root over (h₁₁ ⁽³⁾)}. Finally, using the results of the above calculations, the {tilde over (L)} and {circumflex over (L)}⁻¹ matrices may be constructed. Namely, the non-zero components of {tilde over (L)}, given in equation (116), and of {circumflex over (L)}⁻¹ (which are √{square root over (h₁₁ ⁽²⁾)}√{square root over (h₁₁ ⁽³⁾)}, −h*₁₂√{square root over (h₁₁ ⁽³⁾)}, h₁₁√{square root over (h₁₁ ⁽³⁾)}, −h*₁₃h₁₁ ⁽²⁾+h*₁₂h₁₂ ⁽²⁾*, −h₁₁h₁₂ ⁽²⁾*, and h₁₁h₁₁ ⁽²⁾) may be calculated. Note that no division operation is required in any of the above calculations. For at least this reason, the calculations performed in step 1702 may be considerably less complex than any channel processing that would have been necessary using the original decoding metric.
In some embodiments, the channel processing calculations described above may be performed by a channel preprocessor (e.g., preprocessor 1400 in FIG. 14).

At step 1704, a combined received signal vector, {tilde over (y)}, may be processed by multiplying the vector by {circumflex over (L)}⁻¹, determined from step 1702. The combined received signal vector may be obtained using MRC or any other suitable combining method, such as another form of weighted addition. The combined received signal vector may be obtained from a signal vector combiner, such as MRC combiner 1402 in FIG. 14. The multiplication by {circumflex over (L)}⁻¹ may be performed by a signal processor, such as signal processor 1412 in FIG. 14.
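The combining that feeds step 1704 can be illustrated with a short sketch. It follows the form recited in the claims, in which each received vector and each channel response matrix is multiplied by the conjugate transpose of the respective channel response matrix and the results are summed. The helper names and the two sample 2×2 channels are hypothetical, and the example is noise-free so that the relation between the outputs is easy to check.

```python
def conj_transpose(M):
    """Return the conjugate transpose M* of a matrix given as a list of rows."""
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]

def mat_mul(A, B):
    """Plain matrix product."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(M, v):
    """Matrix-vector product."""
    return [sum(M[i][m] * v[m] for m in range(len(v))) for i in range(len(M))]

def mrc_combine(channels, received):
    """Return (sum of H_i* y_i, sum of H_i* H_i) over all received instances."""
    n = len(channels[0][0])                     # number of transmit inputs
    y_comb = [0j] * n
    H_comb = [[0j] * n for _ in range(n)]
    for H_i, y_i in zip(channels, received):
        H_i_star = conj_transpose(H_i)
        y_part = mat_vec(H_i_star, y_i)
        H_part = mat_mul(H_i_star, H_i)
        y_comb = [a + b for a, b in zip(y_comb, y_part)]
        H_comb = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(H_comb, H_part)]
    return y_comb, H_comb

# Two hypothetical transmissions of the same vector x over different channels.
H1 = [[1 + 1j, 0.5], [0.2j, 1]]
H2 = [[0.3, -1j], [1, 0.4 + 0.1j]]
x = [1 + 0j, -1 + 0j]
y1, y2 = mat_vec(H1, x), mat_vec(H2, x)         # noise-free received vectors
y_comb, H_comb = mrc_combine([H1, H2], [y1, y2])
```

In this noise-free case the combined vector equals the combined channel matrix applied to x, and the combined channel matrix is Hermitian, which is what allows the Cholesky factorization to be applied to it.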

At step 1706, a simplified decoding metric may be calculated for each possible combination of X. For a 3×3 system, the simplified decoding metric may be {tilde over (D)}=∥{circumflex over (L)}⁻¹Y−√{square root over (h₁₁ ⁽³⁾)}{tilde over (L)}*X∥², where h₁₁ ⁽³⁾=h₁₁ ⁽²⁾h₂₂ ⁽²⁾−h₁₂ ⁽²⁾*h₁₂ ⁽²⁾ and {tilde over (L)} and {circumflex over (L)}⁻¹ are given by equations (116) and (119), respectively. Thus, at step 1706, √{square root over (h₁₁ ⁽³⁾)}{tilde over (L)}* may be multiplied by each valid common transmit signal vector, X, and the result from each multiplication may be used to determine the simplified decoding metric. Alternatively, the decoding metric may be a linear approximation of the simplified decoding metric, {tilde over (D)}_(linear)=∥{circumflex over (L)}⁻¹Y−√{square root over (h₁₁ ⁽³⁾)}{tilde over (L)}*X∥. Step 1706 may therefore involve computing a suitable decoding metric many times (e.g., 64×64×64=262,144 times for a 3×3, 64-QAM system, or 64 times for each symbol). Step 1706 may be performed by a maximum-likelihood decoder, such as by ML decoder 1404 in FIG. 14.
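Steps 1702 through 1706 can be sketched end to end for a toy case. The example below is only an illustration under stated assumptions: the channel matrix is hypothetical, each transmit symbol is restricted to ±1 so that only 2×2×2=8 candidates need to be searched rather than 262,144, the combined vector is noise-free, and the (3,2) entry of {circumflex over (L)}⁻¹ is taken with the conjugate on h₁₂ ⁽²⁾.

```python
import itertools
import math

def preprocess_3x3(H):
    """Division-free construction of L_tilde and L_hat_inv (step 1702)."""
    h11, h22, h33 = H[0][0].real, H[1][1].real, H[2][2].real
    h12, h13, h23 = complex(H[0][1]), complex(H[0][2]), complex(H[1][2])
    c12, c13 = h12.conjugate(), h13.conjugate()
    h11_2 = h11 * h22 - (c12 * h12).real
    h12_2 = h11 * h23 - c12 * h13
    h22_2 = h11 * h33 - (c13 * h13).real
    c12_2 = h12_2.conjugate()
    h11_3 = h11_2 * h22_2 - (c12_2 * h12_2).real
    s2, s3 = math.sqrt(h11_2), math.sqrt(h11_3)
    L_tilde = [[s2 * h11, 0, 0], [s2 * c12, h11_2, 0], [s2 * c13, c12_2, s3]]
    L_hat_inv = [[s2 * s3, 0, 0], [-c12 * s3, h11 * s3, 0],
                 [-c13 * h11_2 + c12 * c12_2, -h11 * c12_2, h11 * h11_2]]
    return L_tilde, L_hat_inv, (h11, h11_2, h11_3)

def mat_vec(M, v):
    return [sum(M[i][m] * v[m] for m in range(len(v))) for i in range(len(M))]

def conj_transpose(M):
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]

def simplified_metric(processed_y, scaled_L_star, x):
    """Squared norm of (L_hat_inv y) - sqrt(h11_3) L_tilde* x."""
    diff = [a - b for a, b in zip(processed_y, mat_vec(scaled_L_star, x))]
    return sum(abs(d) ** 2 for d in diff)

# Hypothetical combined channel matrix and transmitted vector.
H = [[4, 1 + 1j, 0.5j], [1 - 1j, 3, 1], [-0.5j, 1, 2]]
x_sent = (1, -1, 1)
y = mat_vec(H, x_sent)                          # noise-free combined vector

L_tilde, L_hat_inv, (h11, h11_2, h11_3) = preprocess_3x3(H)
processed_y = mat_vec(L_hat_inv, y)             # step 1704
root3 = math.sqrt(h11_3)
scaled_L_star = [[root3 * v for v in row] for row in conj_transpose(L_tilde)]

# Step 1706: evaluate the metric for every candidate vector.
candidates = list(itertools.product((1, -1), repeat=3))
metrics = {x: simplified_metric(processed_y, scaled_L_star, x) for x in candidates}
best = min(metrics, key=metrics.get)
```

With no noise, the metric for the transmitted vector is zero up to rounding, so the search returns `x_sent`. Dividing each value in `metrics` by h₁₁h₁₁ ⁽²⁾h₁₁ ⁽³⁾ would give the corresponding value of the original metric.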

After calculating the decoding metric for each possible X, the minimizing values for b=1 and b=0 are used to determine a simplified LLR at step 1708. The simplified LLR may be determined by computing,

$\begin{matrix} {{{LLR}^{\prime} = {{- {\min\limits_{b = 0}\left\{ \overset{\sim}{D} \right\}}} + {\min\limits_{b = 1}\left\{ \overset{\sim}{D} \right\}}}},} & (120) \end{matrix}$ or LLR′_(linear). The simplified LLR may be computed by a maximum-likelihood decoder, such as by ML decoder 1404 in FIG. 14. At step 1712, the simplified LLR may be modified by a factor to compute the true LLR. In the 3×3 case, the factor may be

${\frac{1}{h_{11}h_{11}^{(2)}h_{11}^{(3)}}\mspace{14mu}{or}\mspace{14mu}\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}\sqrt{h_{11}^{(3)}}}},$ depending on which decoding metric is used. This factor may be determined at step 1710.

Step 1710 may be executed while steps 1704, 1706, and 1708 are being executed. Namely, step 1710 may be computed at any time while steps 1704, 1706 and 1708 are computed. Alternatively, step 1710 may be computed some time before or some time after the other steps. Step 1710 involves performing calculations that are not used by steps 1704, 1706, and 1708, but are used to compute the final LLR value. Thus, step 1710 may perform any suitable calculations that are used in calculations after the critical path (e.g., step 1712). For a 3×3 system, step 1710 may involve computing h₁₁h₁₁ ⁽²⁾h₁₁ ⁽³⁾, and using the result to compute

$\frac{1}{h_{11}h_{11}^{(2)}h_{11}^{(3)}}.$ Alternatively, step 1710 may involve computing √{square root over (h₁₁)}, then using the result to compute √{square root over (h₁₁)}√{square root over (h₁₁ ⁽²⁾)}√{square root over (h₁₁ ⁽³⁾)}, and finally computing

$\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}\sqrt{h_{11}^{(3)}}}.$ Recall that √{square root over (h₁₁ ⁽²⁾)} and √{square root over (h₁₁ ⁽³⁾)} have already been computed at step 1702. √{square root over (h₁₁)} may therefore be computed using the same hardware, if applicable, as the hardware used to compute √{square root over (h₁₁ ⁽²⁾)} and/or √{square root over (h₁₁ ⁽³⁾)}.

$\frac{1}{h_{11}h_{11}^{(2)}h_{11}^{(3)}}\mspace{14mu}{or}\mspace{14mu}\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}\sqrt{h_{11}^{(3)}}}$ may be used by step 1712 to compute the final LLR, as described above. Step 1710 may be computed by a channel processor, such as preprocessor 1400 in FIG. 14.
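The split between the critical path (steps 1706 and 1708) and the modifier (steps 1710 and 1712) can be sketched as follows. The function names, the two-bit labeling, the metric values, and the pivot values are all hypothetical and serve only to show the arithmetic of equation (120) followed by the scaling.

```python
import math

def simplified_llr(metrics, bit_index):
    """Equation (120): -min over b=0 plus min over b=1 of the simplified metric."""
    min0 = min(d for bits, d in metrics.items() if bits[bit_index] == 0)
    min1 = min(d for bits, d in metrics.items() if bits[bit_index] == 1)
    return -min0 + min1

def llr_modifier(h11, h11_2, h11_3, linear=False):
    """Step 1710: the only place where a division (and, if linear, a root) is needed."""
    product = h11 * h11_2 * h11_3
    return 1.0 / math.sqrt(product) if linear else 1.0 / product

# Illustrative metric values keyed by the bit labels of four candidates.
metrics = {(0, 0): 5.0, (0, 1): 1.0, (1, 0): 3.0, (1, 1): 7.0}

llr_prime = [simplified_llr(metrics, i) for i in range(2)]   # step 1708
factor = llr_modifier(4.0, 10.0, 65.0)                       # step 1710
llr = [factor * value for value in llr_prime]                # step 1712
```

Here `llr_prime` evaluates to `[2.0, -2.0]`, and the factor is 1/2600. Because `llr_modifier` does not depend on the metrics, it can be evaluated before, during, or after the search.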

As discussed above in connection with the 2×2 example, the decoding implementation shown above has many advantages. Firstly, the division operation is left out of the critical path, and may be performed at substantially the same time as the critical path calculations. Therefore, the division operation may be implemented using a slow, but low-complexity algorithm, such as a serial inversion mechanism. Furthermore, the square root operations are left out of the critical path, which may again allow a receiver designer to lower the complexity of the square root implementations.

Secondly, if the linear simplified decoding metric is used, the decoding may be symbol-based. That is, the decoder may output estimates of each symbol rather than the entire signal vector. If hard-decisions are used, the simplified LLR determined symbol-by-symbol is sufficient to map each symbol to a hard decision. Thus, the modifier is no longer needed, and steps 1710 and 1712 may be completely omitted. Therefore, division operations are not necessary, nor are any final multipliers to compute the true LLR.
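The reason the modifier may be dropped for hard decisions can be stated in one line. Assuming the combined channel matrix is positive definite, so that h₁₁, h₁₁ ⁽²⁾, and h₁₁ ⁽³⁾ are all positive, the true LLR is the simplified LLR multiplied by a positive constant, and the sign used for the hard decision is therefore unchanged:

$\begin{matrix} {{LLR}_{linear} = {\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}\sqrt{h_{11}^{(3)}}}{LLR}_{linear}^{\prime}}} & \Longrightarrow & {{{sgn}\left( {LLR}_{linear} \right)} = {{sgn}\left( {LLR}_{linear}^{\prime} \right)}} \end{matrix}$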

Generally, a decoding metric with Cholesky factorization, ∥L⁻¹{tilde over (y)}_(N)−L*x∥², for an R×R MIMO system may be factored into a squared, simplified decoding metric, {tilde over (D)}=∥{circumflex over (L)}⁻¹Y−√{square root over (h₁₁ ^((R)))}{tilde over (L)}*X∥², and modifier,

$\frac{1}{\prod\limits_{i = 1}^{R}\; h_{11}^{(i)}},$ where h₁₁ ⁽¹⁾=h₁₁. Alternatively, the decoding metric may be factored into a linear, simplified decoding metric, {tilde over (D)}_(linear)=∥{circumflex over (L)}⁻¹Y−√{square root over (h₁₁ ^((R)))}{tilde over (L)}*X∥, and modifier,

$\frac{1}{\prod\limits_{i = 1}^{R}\;\sqrt{h_{11}^{(i)}}},$ where h₁₁ ⁽¹⁾=h₁₁. Derivations of the equations for 2×2 and 3×3 MIMO systems were given above. Decoding of a signal vector for a general R-input, R-output MIMO system may be performed using the steps shown in FIG. 15, and may have any of the features described above in connection with the 2×2 and 3×3 examples of FIG. 16 and FIG. 17.
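For R=3, this factoring can be checked directly by substituting the scaled forms of L and L⁻¹ from equations (115) and (118) into the original metric. The worked step below writes the combined received vector as {tilde over (y)} and the candidate vector as x, corresponding to Y and X in the simplified metric:

$\begin{matrix} {{{L^{- 1}\overset{\sim}{y}} - {L^{*}x}}}^{2} & {= {{{\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}\sqrt{h_{11}^{(3)}}}{\hat{L}}^{- 1}\overset{\sim}{y}} - {\frac{1}{\sqrt{h_{11}}\sqrt{h_{11}^{(2)}}}{\overset{\sim}{L}}^{*}x}}}^{2}} \\ & {= {\frac{1}{h_{11}h_{11}^{(2)}h_{11}^{(3)}}{{{{\hat{L}}^{- 1}\overset{\sim}{y}} - {\sqrt{h_{11}^{(3)}}{\overset{\sim}{L}}^{*}x}}}^{2}.}} \end{matrix}$

The leading factor is the squared-metric modifier for the 3×3 case, and its square root is the modifier for the linear metric.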

Referring now to FIGS. 18A-18G, various exemplary implementations of the present invention are shown.

Referring now to FIG. 18A, the present invention can be implemented in a hard disk drive 1800. The present invention may implement either or both signal processing and/or control circuits, which are generally identified in FIG. 18A at 1802. In some implementations, the signal processing and/or control circuit 1802 and/or other circuits (not shown) in the HDD 1800 may process data, perform coding and/or encryption, perform calculations, and/or format data that is output to and/or received from a magnetic storage medium 1806.

The HDD 1800 may communicate with a host device (not shown) such as a computer, mobile computing devices such as personal digital assistants, cellular phones, media or MP3 players and the like, and/or other devices via one or more wired or wireless communication links 1808. The HDD 1800 may be connected to memory 1809 such as random access memory (RAM), low latency nonvolatile memory such as flash memory, read only memory (ROM) and/or other suitable electronic data storage.

Referring now to FIG. 18B, the present invention can be implemented in a digital versatile disc (DVD) drive 1810. The present invention may implement either or both signal processing and/or control circuits, which are generally identified in FIG. 18B at 1812, and/or mass data storage of the DVD drive 1810. The signal processing and/or control circuit 1812 and/or other circuits (not shown) in the DVD 1810 may process data, perform coding and/or encryption, perform calculations, and/or format data that is read from and/or data written to an optical storage medium 1816. In some implementations, the signal processing and/or control circuit 1812 and/or other circuits (not shown) in the DVD 1810 can also perform other functions such as encoding and/or decoding and/or any other signal processing functions associated with a DVD drive.

The DVD drive 1810 may communicate with an output device (not shown) such as a computer, television or other device via one or more wired or wireless communication links 1817. The DVD 1810 may communicate with mass data storage 1818 that stores data in a nonvolatile manner. The mass data storage 1818 may include a hard disk drive (HDD). The HDD may have the configuration shown in FIG. 18A. The HDD may be a mini HDD that includes one or more platters having a diameter that is smaller than approximately 1.8″. The DVD 1810 may be connected to memory 1819 such as RAM, ROM, low latency nonvolatile memory such as flash memory and/or other suitable electronic data storage.

Referring now to FIG. 18C, the present invention can be implemented in a high definition television (HDTV) 1820. The present invention may implement either or both signal processing and/or control circuits, which are generally identified in FIG. 18C at 1822, a wireless local area network (WLAN) interface and/or mass data storage of the HDTV 1820. The HDTV 1820 receives HDTV input signals in either a wired or wireless format and generates HDTV output signals for a display 1826. In some implementations, signal processing circuit and/or control circuit 1822 and/or other circuits (not shown) of the HDTV 1820 may process data, perform coding and/or encryption, perform calculations, format data and/or perform any other type of HDTV processing that may be required.

The HDTV 1820 may communicate with mass data storage 1827 that stores data in a nonvolatile manner such as optical and/or magnetic storage devices for example hard disk drives HDD and/or DVDs. At least one HDD may have the configuration shown in FIG. 18A and/or at least one DVD may have the configuration shown in FIG. 18B. The HDD may be a mini HDD that includes one or more platters having a diameter that is smaller than approximately 1.8″. The HDTV 1820 may be connected to memory 1828 such as RAM, ROM, low latency nonvolatile memory such as flash memory and/or other suitable electronic data storage. The HDTV 1820 also may support connections with a WLAN via a WLAN network interface 1829.

Referring now to FIG. 18D, the present invention implements a control system of a vehicle 1830, a WLAN interface and/or mass data storage of the vehicle control system. In some implementations, the present invention may implement a powertrain control system 1832 that receives inputs from one or more sensors 1836, such as temperature sensors, pressure sensors, rotational sensors, airflow sensors and/or any other suitable sensors, and/or that generates one or more output control signals such as engine operating parameters, transmission operating parameters, and/or other control signals, which may be provided to one or more output devices 1838.

The present invention may also be implemented in other control systems 1840 of the vehicle 1830. The control system 1840 may likewise receive signals from input sensors 1842 and/or output control signals to one or more output devices 1844. In some implementations, the control system 1840 may be part of an anti-lock braking system (ABS), a navigation system, a telematics system, a vehicle telematics system, a lane departure system, an adaptive cruise control system, a vehicle entertainment system such as a stereo, DVD, compact disc and the like. Still other implementations are contemplated.

The powertrain control system 1832 may communicate with mass data storage 1846 that stores data in a nonvolatile manner. The mass data storage 1846 may include optical and/or magnetic storage devices for example hard disk drives HDD and/or DVDs. At least one HDD may have the configuration shown in FIG. 18A and/or at least one DVD may have the configuration shown in FIG. 18B. The HDD may be a mini HDD that includes one or more platters having a diameter that is smaller than approximately 1.8″. The powertrain control system 1832 may be connected to memory 1847 such as RAM, ROM, low latency nonvolatile memory such as flash memory and/or other suitable electronic data storage. The powertrain control system 1832 also may support connections with a WLAN via a WLAN network interface 1848. The control system 1840 may also include mass data storage, memory and/or a WLAN interface (all not shown).

Referring now to FIG. 18E, the present invention can be implemented in a cellular phone 1850 that may include a cellular antenna 1851. The present invention may implement either or both signal processing and/or control circuits, which are generally identified in FIG. 18E at 1852, a WLAN interface and/or mass data storage of the cellular phone 1850. In some implementations, the cellular phone 1850 includes a microphone 1856, an audio output 1858 such as a speaker and/or audio output jack, a display 1860 and/or an input device 1862 such as a keypad, pointing device, voice actuation and/or other input device. The signal processing and/or control circuits 1852 and/or other circuits (not shown) in the cellular phone 1850 may process data, perform coding and/or encryption, perform calculations, format data and/or perform other cellular phone functions.

The cellular phone 1850 may communicate with mass data storage 1864 that stores data in a nonvolatile manner such as optical and/or magnetic storage devices for example hard disk drives HDD and/or DVDs. At least one HDD may have the configuration shown in FIG. 18A and/or at least one DVD may have the configuration shown in FIG. 18B. The HDD may be a mini HDD that includes one or more platters having a diameter that is smaller than approximately 1.8″. The cellular phone 1850 may be connected to memory 1866 such as RAM, ROM, low latency nonvolatile memory such as flash memory and/or other suitable electronic data storage. The cellular phone 1850 also may support connections with a WLAN via a WLAN network interface 1868.

Referring now to FIG. 18F, the present invention can be implemented in a set top box 1880. The present invention may implement either or both signal processing and/or control circuits, which are generally identified in FIG. 18F at 1884, a WLAN interface and/or mass data storage of the set top box 1880. The set top box 1880 receives signals from a source such as a broadband source and outputs standard and/or high definition audio/video signals suitable for a display 1888 such as a television and/or monitor and/or other video and/or audio output devices. The signal processing and/or control circuits 1884 and/or other circuits (not shown) of the set top box 1880 may process data, perform coding and/or encryption, perform calculations, format data and/or perform any other set top box function.

The set top box 1880 may communicate with mass data storage 1890 that stores data in a nonvolatile manner. The mass data storage 1890 may include optical and/or magnetic storage devices for example hard disk drives HDD and/or DVDs. At least one HDD may have the configuration shown in FIG. 18A and/or at least one DVD may have the configuration shown in FIG. 18B. The HDD may be a mini HDD that includes one or more platters having a diameter that is smaller than approximately 1.8″. The set top box 1880 may be connected to memory 1894 such as RAM, ROM, low latency nonvolatile memory such as flash memory and/or other suitable electronic data storage. The set top box 1880 also may support connections with a WLAN via a WLAN network interface 1896.

Referring now to FIG. 18G, the present invention can be implemented in a media player 1900. The present invention may implement either or both signal processing and/or control circuits, which are generally identified in FIG. 18G at 1904, a WLAN interface and/or mass data storage of the media player 1900. In some implementations, the media player 1900 includes a display 1907 and/or a user input 1908 such as a keypad, touchpad and the like. In some implementations, the media player 1900 may employ a graphical user interface (GUI) that typically employs menus, drop down menus, icons and/or a point-and-click interface via the display 1907 and/or user input 1908. The media player 1900 further includes an audio output 1909 such as a speaker and/or audio output jack. The signal processing and/or control circuits 1904 and/or other circuits (not shown) of the media player 1900 may process data, perform coding and/or encryption, perform calculations, format data and/or perform any other media player function.

The media player 1900 may communicate with mass data storage 1910 that stores data such as compressed audio and/or video content in a nonvolatile manner. In some implementations, the compressed audio files include files that are compliant with MP3 format or other suitable compressed audio and/or video formats. The mass data storage may include optical and/or magnetic storage devices for example hard disk drives HDD and/or DVDs. At least one HDD may have the configuration shown in FIG. 18A and/or at least one DVD may have the configuration shown in FIG. 18B. The HDD may be a mini HDD that includes one or more platters having a diameter that is smaller than approximately 1.8″. The media player 1900 may be connected to memory 1914 such as RAM, ROM, low latency nonvolatile memory such as flash memory and/or other suitable electronic data storage. The media player 1900 also may support connections with a WLAN via a WLAN network interface 1916. Still other implementations in addition to those described above are contemplated.

The foregoing describes systems and methods for decoding a signal vector, where the receiver may receive multiple instances of the same transmit signal vector. The above described embodiments of the present invention are presented for the purposes of illustration and not of limitation. Furthermore, the present invention is not limited to a particular implementation. The invention may be implemented in hardware, such as on an application specific integrated circuit (ASIC) or on a field-programmable gate array (FPGA). The invention may also be implemented in software.

1. A method for decoding a signal vector in a multiple-input multiple-output transmission scheme, comprising: receiving multiple signal vectors corresponding to a common transmit signal vector, wherein each of the received signal vectors is associated with a channel response matrix; combining the multiple received signal vectors into a combined received signal vector; combining the channel response matrices into a combined channel response matrix; processing the combined received signal vector using a noise whitening function derived from the combined channel response matrix; and decoding the processed combined received signal vector based on the combined channel response matrix.
 2. The method of claim 1, wherein combining the multiple received signal vectors into a combined received signal vector comprises multiplying each of the received signal vectors by a conjugate transpose of a respective channel response matrix and summing the results.
 3. The method of claim 2, wherein combining the channel response matrices into a combined channel response matrix comprises multiplying each of the channel response matrices by a conjugate transpose of a respective channel response matrix and summing the results.
 4. The method of claim 1, wherein processing the combined received signal vector using a noise whitening function derived from the combined channel response matrix comprises multiplying the combined received signal vector by a processed version of the combined channel response matrix.
 5. The method of claim 4, wherein the processed version of the combined channel response matrix is produced by calculating a square root of the combined channel response matrix.
 6. The method of claim 1, wherein decoding the processed combined received signal vector comprises calculating the metric x*_(N){tilde over (H)}_(N)x_(N)−2R{x*_(N){tilde over (y)}_(N)}, where x_(N) is the common transmit signal vector, x*_(N) is a conjugate transpose of x_(N), {tilde over (H)}_(N) is the combined channel response matrix, and {tilde over (y)}_(N) is the combined received signal vector.
 7. The method of claim 1 further comprising performing QR decomposition on a square root of the combined channel response matrix.
 8. The method of claim 7, wherein decoding the processed combined received signal vector comprises calculating a metric ∥Q*R⁻¹Q*{tilde over (y)}_(N)−Rx∥² using decomposed matrices Q and R, where x is the common transmit signal vector and {tilde over (y)}_(N) is the combined received signal vector.
 9. The method of claim 1 further comprising performing Cholesky factorization on the combined channel response matrix.
 10. The method of claim 9, wherein decoding the processed combined received signal vector comprises calculating a metric ∥L⁻¹{tilde over (y)}_(N)−L*x∥², where x is the common transmit signal vector, {tilde over (y)}_(N) is the combined received signal vector, and L is a matrix produced by the Cholesky factorization.
 11. A system for decoding a signal vector in a multiple-input multiple-output transmission scheme, comprising: a receiver configured to receive multiple signal vectors corresponding to a common transmit signal vector, wherein each of the received signal vectors is associated with a channel response matrix; a vector combiner configured to combine the multiple received signal vectors into a combined received signal vector; a matrix combiner configured to combine the channel response matrices into a combined channel response matrix; a signal processor configured to process the combined received signal vector using a noise whitening function derived from the combined channel response matrix; and a decoder configured to decode the processed combined received signal vector based on the combined channel response matrix.
 12. The system of claim 11, wherein the vector combiner combines the multiple received signal vectors by multiplying each of the received signal vectors by a conjugate transpose of a respective channel response matrix and summing the results.
 13. The system of claim 12, wherein the matrix combiner combines the channel response matrices by multiplying each of the channel response matrices by a conjugate transpose of a respective channel response matrix and summing the results.
 14. The system of claim 11, wherein the signal processor processes the combined received signal vector by multiplying the combined received signal vector by a processed version of the combined channel response matrix.
 15. The system of claim 14, wherein the signal processor produces the processed version of the combined channel response matrix by calculating a square root of the combined channel response matrix.
 16. The system of claim 11, wherein the decoder decodes the processed combined received signal vector by calculating the metric x*_(N){tilde over (H)}_(N)x_(N)−2R{x*_(N){tilde over (y)}_(N)}, where x_(N) is the common transmit signal vector, x*_(N) is a conjugate transpose of x_(N), {tilde over (H)}_(N) is the combined channel response matrix, and {tilde over (y)}_(N) is the combined received signal vector.
 17. The system of claim 11, wherein the decoder is further configured to perform QR decomposition on a square root of the combined channel response matrix.
 18. The system of claim 17, wherein the decoder decodes the processed combined received signal vector by calculating a metric ∥Q*R⁻¹Q*{tilde over (y)}_(N)−Rx∥² using decomposed matrices Q and R, where x is the common transmit signal vector and {tilde over (y)}_(N) is the combined received signal vector.
 19. The system of claim 11, wherein the decoder is further configured to perform Cholesky factorization on the combined channel response matrix.
 20. The system of claim 19, wherein the decoder decodes the processed combined received signal vector by calculating a metric ∥L⁻¹{tilde over (y)}_(N)−L*x∥², where x is the common transmit signal vector, {tilde over (y)}_(N) is the combined received signal vector, and L is a matrix produced by the Cholesky factorization. 